Journal of Global Change Data & Discovery, 2021, 5(4): 363-372


Citation: Bai, J. T., Pan, W., Hou, Y. J., et al. Building an Integrated Toponymic Information System Based on TGIS and Big Data Technology [J]. Journal of Global Change Data & Discovery, 2021, 5(4): 363-372. DOI: 10.3974/geodp.2021.04.01.

Building an Integrated Toponymic Information System Based on TGIS and Big Data Technology

Bai, J. T.1  Pan, W.2*  Hou, Y. J.3  Zuo, Y. Q.1  Yang, H. H.1

1. Xi’an Map Publishing House, Xi’an 710054, China;

2. College of History & Archives, Yunnan University, Kunming 650091, China;

3. Center for Historical Environment and Socio-Economic Development in Northwest China of Shaanxi Normal University, Xi’an 710062, China

 

Abstract: Storing toponyms and their historical changes in a scientifically structured way makes it possible to query them for different historical stages and to represent them visually. Such a toponymic information system can support historical and geographical research by integrating temporal and spatial information for demonstration and analysis tasks. This paper therefore builds a toponym database by combining a spatio-temporal framework and big data technology with a Temporal Geographic Information System (TGIS). The proposed method supports updating and maintaining toponymic information and aims to provide a modern information technology platform that maintains and disseminates it. Specifically, we built a toponym database that includes spatial and temporal information, and an integrated toponymic information system that incorporates tools to analyze this information along with a toponym updating mechanism based on big data analysis. Additionally, our framework allows different research projects to reference or confirm each other’s results, improving the extent and depth of public services based on those results and promoting their integrated utilization. Finally, the system reveals historical toponymic information and meets the asymmetrical and complementary demands of various fields.

Keywords: TGIS; toponym; spatio-temporal framework; visualization; analysis

DOI: https://doi.org/10.3974/geodp.2021.04.01

CSTR: https://cstr.escience.org.cn/CSTR:20146.14.2021.04.01

1 Introduction

Toponyms are essential geographical information and are often used as an entry point to a wider range of information. With the advent of the big data era, toponyms have increasingly been accepted as a critical tool for accessing data resources for academic research, information query, and information interaction[3,4]. Rapid social development, combined with advances in big data analysis and artificial intelligence and with easy access to the internet, requires toponyms to be updated quickly and toponymic data to be transmitted over the internet on a massive scale. Despite the importance of toponyms, their changes over space and time have not yet been explored[1]. Toponym management and its related services have therefore become challenging, as the staff responsible for collecting, updating, utilizing, maintaining, and managing toponymic data have to deal with several emerging problems[2].

Like general geographical phenomena, toponyms possess three essential features: space, attributes, and time[4]. Temporal toponymic information is often stored irregularly or lost because of incomplete records and the way historical data were stored[5]. Nevertheless, mastering the historical data of geographical objects makes it possible to analyze current data and retrace the objects’ history, and ultimately to determine regularities and predict future development[6–9]. Spurred by the advantages of maintaining accurate toponymic information, this research introduces big data analysis technology to build an integrated toponymic information system. Specifically, we build a spatio-temporal toponym database by adopting semantic analysis and automatic update technology that relies on big data analysis and spatio-temporal data modeling theory. Furthermore, we set up an integrated toponymic information system that acts as a tool for spatio-temporal analysis and as a mechanism to update toponyms.

2 Logical Framework of the System

A spatio-temporal toponym database based on big data technology and a Temporal Geographic Information System (TGIS) can provide a framework for establishing an integrated information system[10,11]. The database should provide researchers with fundamental scientific data and the appropriate tools to search for toponyms and their relevant fields, in order to deliver better social services. The system framework is illustrated in Figure 1.

3 Structure of Spatio-temporal Toponymic Data

Because toponyms are a type of geographical expression involving attributes of both nature and human society, automatic toponym updating has to rely on a spatio-temporal database of toponyms and an analysis of the spatio-temporal domains. Relevant data should be obtained from different fields, but the differing data demands of various industries and fields lead to differences in how historical information is mined and described. Through semantic analysis, these differences should be excluded, narrowed, or used to confirm each other, and then act as determinants or supporting evidence during the maintenance and update of historical toponymic information. The toponymic spatio-temporal domains often vary randomly, and the spatio-temporal state and grid model has proven quite effective in organizing and managing toponymic spatio-temporal data[7]. To facilitate the spatio-temporal analysis and visual presentation of historical toponyms, we built a spatio-temporal database based on the GIS spatial topological relation theory, whose conceptual design is presented in Figure 2.

4 Data Processing

Generating the second-level district grids is the primary modeling procedure. It uses modern information technology to store all toponymic spatio-temporal information and modern cartographic techniques to visualize the toponymic data. This modeling algorithm improves the storage efficiency of toponymic information and supports information query and integrated analysis. We adopt the traditional administrative division to fully exploit the algorithm’s advantages and build the basic expression units. Grids that are interconnected within an administrative district constitute a basic expression unit, for which the system only needs to record the position codes of two grids: the first and the last. Grids are coded from left to right by row according to the attributes of toponyms and grid locations. Grids that are interconnected and have the same toponymic attributes constitute a district. If a grid is located at the boundary of the district being coded, it is assigned to that district and the district’s boundary is redefined. Otherwise, a new district is defined, and the grid is used to delimit its boundary.

Figure 1  Framework of the spatio-temporal toponym database

Figure 2  Flow chart of the database development

The core of the second-level district indexing is processing toponymic units represented by polygon vector data. The data processing results are second-level districts, which are then coded and recorded, and the logical connections between first-level and second-level grids are identified. Second-level districts are generated employing the Union tool to deal with administrative district units and district map layers. Every second-level district is then coded through loop computation, the logical connections between the first-level and second-level districts are identified, and their codes are recorded as the digital basis of the district relationship table. It should be noted that during this process, the attribute information needs to be maintained to calculate, identify, label, store, and rank the second-level districts’ grid coordinates.
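The row-wise coding of basic expression units described in this section can be sketched as follows. This is a minimal illustration rather than the authors’ implementation: it assumes the grid is held as a small in-memory raster of district identifiers, that a position code equals `row * n_cols + col`, and that a unit never spans two rows. The function name `encode_runs` and the sample raster are hypothetical.

```python
from typing import List, Optional, Tuple

# (district id, first position code, last position code)
Run = Tuple[str, int, int]


def encode_runs(grid: List[List[Optional[str]]]) -> List[Run]:
    """Compress a raster of district ids into row-wise runs.

    Each run stores only the codes of its first and last grid cell.
    Cells holding None (no district) are skipped.
    """
    runs: List[Run] = []
    n_cols = len(grid[0]) if grid else 0
    for r, row in enumerate(grid):
        start = 0
        for c in range(1, n_cols + 1):
            # Close the current run at the row end or when the district changes.
            if c == n_cols or row[c] != row[start]:
                if row[start] is not None:
                    runs.append((row[start], r * n_cols + start, r * n_cols + c - 1))
                start = c
    return runs


sample_grid = [
    ["A", "A", "B", "B"],
    ["A", "A", "A", "B"],
    [None, "C", "C", "B"],
]
sample_runs = encode_runs(sample_grid)
```

Storing two codes per run instead of one value per cell is what makes this representation compact for large, homogeneous districts.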

5 Analysis, Statistics, and Processing of Spatio-temporal Information of Toponyms

The spatio-temporal analysis mainly includes locating toponyms in space, along with toponymic extraction, merging their temporal sequences, calculating their frequency statistics at specific locations, calculating their accumulative lifetime, and generating their spatio-temporal volume matrix.

5.1 Locating Toponyms in Space

When a new toponym is captured by big data analysis technology, its spatial location is identified from its spatial attributes. The toponym’s temporal attributes and stories are then used to confirm whether it is a new toponym. If it is, a new toponym record is created. Otherwise, it is treated as historical information of an existing toponym and stored under that toponym.
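The decision between creating a record and appending history can be sketched as below. The matching rule is an assumption made only for illustration: a captured toponym is treated as already known when a record with the same name exists at the same grid position code. The paper’s actual check also draws on temporal attributes and stories, which are not modeled here. `ToponymRecord` and `register` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToponymRecord:
    name: str
    grid_code: int
    # Each entry: (start year, end year, source note)
    history: List[Tuple[int, int, str]] = field(default_factory=list)


def register(db: Dict[Tuple[str, int], ToponymRecord],
             name: str, grid_code: int,
             start: int, end: int, note: str) -> bool:
    """Store a captured toponym; return True if a new record was created."""
    key = (name, grid_code)
    is_new = key not in db
    if is_new:
        db[key] = ToponymRecord(name, grid_code)
    # New or not, the captured period is kept as part of the record's history.
    db[key].history.append((start, end, note))
    return is_new
```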

5.2 Toponym Extraction

According to relevant standards and regulations, toponyms of administrative districts generally comprise two parts: a proprietary name and a common (generic) name. Most databases store the complete toponyms of administrative districts, while the proposed algorithm extracts only the proprietary names and records them as toponyms.
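A minimal sketch of the extraction step, assuming English-language district names in which the common name is a trailing word. The list `GENERIC_TERMS` and the function `split_toponym` are hypothetical and would need to be replaced by the terms defined in the applicable naming standard.

```python
from typing import Tuple

# Hypothetical list of common (generic) name parts.
GENERIC_TERMS = ("Township", "Town", "County", "District", "City", "Province")


def split_toponym(full_name: str) -> Tuple[str, str]:
    """Split a full district name into (proprietary name, common name).

    If no known common name is found, the whole string is returned as
    the proprietary name and the common name is empty.
    """
    name = full_name.strip()
    for term in GENERIC_TERMS:
        suffix = " " + term
        # Require something in front of the suffix so the result is not empty.
        if name.endswith(suffix) and len(name) > len(suffix):
            return name[: -len(suffix)], term
    return name, ""
```

For example, `split_toponym("Alpha Township")` yields the proprietary name `"Alpha"`, which is the part recorded as the toponym.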

5.3 Merging Time Intervals of Toponyms

Changes of toponyms related to administrative districts commonly involve their common names, e.g., “township” changed into “town”, but proprietary names are rarely altered. For toponyms sharing the same proprietary name, if they refer to the same place, their time spans should be merged, and the toponym should be stored in the toponymic record. The process of information merging is as follows: temporal attributes and common names of coded grids are recorded as the time field and are then ranked. Adjacent grids sharing the same toponym whose time is successive have their records merged, with the start and end times of the toponyms determined based on the time sequences of the grids. If time is ambiguous, then a synchronous record is created, while if the adjacent grids have no such connections, the next record will be checked.
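The merging of successive time spans can be illustrated with a short sketch. It assumes each record is a `(proprietary name, start year, end year)` tuple that already refers to the same place, and that two spans are successive when the later one starts no later than the earlier one ends (plus an optional `max_gap`). The handling of ambiguous times through synchronous records is not covered. `merge_spans` is a hypothetical name.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (proprietary name, start year, end year)


def merge_spans(spans: List[Span], max_gap: int = 0) -> List[Span]:
    """Merge successive spans that share the same proprietary name."""
    merged: List[Span] = []
    # Sort by name, then by start year, so candidates for merging are adjacent.
    for name, start, end in sorted(spans, key=lambda s: (s[0], s[1])):
        if merged and merged[-1][0] == name and start <= merged[-1][2] + max_gap:
            prev = merged[-1]
            merged[-1] = (name, prev[1], max(prev[2], end))
        else:
            merged.append((name, start, end))
    return merged


sample_spans = [
    ("Alpha", 1984, 2000),  # e.g. recorded as a town
    ("Alpha", 1950, 1984),  # e.g. recorded as a township
    ("Alpha", 2010, 2020),  # reused after an interruption
    ("Beta", 1990, 1995),
]
merged_spans = merge_spans(sample_spans)
```

Here the two successive "Alpha" spans collapse into one record covering 1950 to 2000, while the later reuse stays separate because it follows an interruption.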

5.4 Statistics of Toponym Frequency of Specific Locations

A specific grid may include several historical toponyms, potentially being repeatedly adopted from time to time. All grids are sequentially dealt with, and all grid records are stored in a timetable with the toponyms ranked based on their starting time. If one toponym has not yet been included, it is added to the storage table. Otherwise, it will be dealt with during the following time interval. This process continues until all grids are considered. Meanwhile, the boundaries and the total number of toponyms per grid are recorded to form a matrix comprising grid locations and the total numbers of toponyms. The toponyms’ changes related to target districts and their coverage are revealed by combining the initial grid data.
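The frequency statistics can be sketched as a count of distinct toponyms per grid. The sketch assumes records of the form `(position code, toponym, start year)` and a rectangular grid coded as `row * n_cols + col`; `toponym_count_matrix` is a hypothetical name.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, str, int]  # (position code, toponym, start year)


def toponym_count_matrix(records: List[Record], n_rows: int, n_cols: int):
    """Return (matrix of distinct-toponym counts, names seen per grid)."""
    seen: Dict[int, List[str]] = {}
    # Process each grid's records in order of starting time.
    for code, name, _start in sorted(records, key=lambda r: (r[0], r[2])):
        names = seen.setdefault(code, [])
        if name not in names:  # a reused toponym is counted only once
            names.append(name)
    matrix = [
        [len(seen.get(r * n_cols + c, [])) for c in range(n_cols)]
        for r in range(n_rows)
    ]
    return matrix, seen


sample_records = [
    (0, "Alpha", 100), (0, "Beta", 300), (0, "Alpha", 500),
    (1, "Beta", 300),
    (3, "Gamma", 50),
]
count_matrix, names_by_grid = toponym_count_matrix(sample_records, 2, 2)
```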

5.5 Toponym Lifetime

According to the toponyms’ spatio-temporal information statistics, the number of toponym repetitions, the lifetime of each repetition, and the cumulative lifetime of one toponym are three significant indicators revealing the historical, cultural, and regional changes of a place. The lifetime of a toponym refers to the time duration of a toponym’s spatio-temporal state, with the number of repetitions and accumulative toponym lifetime involving more complex definitions. The accumulative lifetime refers to the sum of the same toponym’s repetition time intervals at the same location. If toponyms are interrupted, the start and end time of a toponym cycle should be recorded and the number of interruptions determines the number of toponym repetitions. Nested loop tools can be used to deal with every toponym and its spatio-temporal state.
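The three indicators can be computed with a short sketch. It assumes the input is the list of `(start, end)` year spans of one toponym at one location, and it counts a repetition as each uninterrupted cycle of use, so the number of repetitions equals the number of interruptions plus one. That counting convention and the name `lifetime_stats` are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def lifetime_stats(spans: List[Tuple[int, int]]) -> Dict[str, object]:
    """Compute repetitions, per-cycle lifetimes and cumulative lifetime."""
    cycles: List[List[int]] = []
    for start, end in sorted(spans):
        if cycles and start <= cycles[-1][1]:
            # No interruption: extend the current cycle.
            cycles[-1][1] = max(cycles[-1][1], end)
        else:
            # Interruption (or first span): open a new cycle.
            cycles.append([start, end])
    lifetimes = [end - start for start, end in cycles]
    return {
        "repetitions": len(cycles),
        "lifetimes": lifetimes,
        "cumulative": sum(lifetimes),
    }


stats = lifetime_stats([(100, 250), (250, 300), (500, 560)])
```

In this example the toponym is used from 100 to 300, interrupted, and then reused from 500 to 560, giving two cycles and a cumulative lifetime of 260 years.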

5.6 Generation of Spatio-temporal Volume Matrix of Toponyms

The spatio-temporal volume matrix involving the spatial coverage of a toponym is the primary target of the toponyms’ spatio-temporal analysis. The spatio-temporal volume matrix pipeline is illustrated in Figure 3. According to the district classification table and the toponym spatio-temporal state table obtained through toponym retrieval, the spatio-temporal state sequence and district coverage can be obtained. Then the row and column limits of the toponym’s spatial coverage are calculated through nested loop processing, and the spatio-temporal state of every grid is retrieved and added to the accumulative lifetime of the toponym. The spatio-temporal volume matrix of the toponym is obtained when the cycle is completed.
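A minimal sketch of the volume matrix generation, assuming each spatio-temporal state of a toponym is given as `(start year, end year, covered position codes)` and that codes follow `row * n_cols + col`. The function first derives the row and column limits of the toponym’s coverage, then accumulates the lifetime per grid in a nested loop. `volume_matrix` and the sample states are hypothetical.

```python
from typing import List, Sequence, Tuple

State = Tuple[int, int, Sequence[int]]  # (start, end, covered position codes)


def volume_matrix(states: List[State], n_cols: int):
    """Return (matrix of accumulated lifetimes, (first row, first column))."""
    cells = [divmod(code, n_cols) for _, _, codes in states for code in codes]
    if not cells:
        return [], (0, 0)
    # Row and column limits of the toponym's spatial coverage.
    r0, r1 = min(r for r, _ in cells), max(r for r, _ in cells)
    c0, c1 = min(c for _, c in cells), max(c for _, c in cells)
    matrix = [[0] * (c1 - c0 + 1) for _ in range(r1 - r0 + 1)]
    for start, end, codes in states:
        for code in codes:
            r, c = divmod(code, n_cols)
            # Add this state's duration to the grid's accumulated lifetime.
            matrix[r - r0][c - c0] += end - start
    return matrix, (r0, c0)


sample_states = [
    (100, 200, [5, 6]),
    (200, 350, [5, 6, 9]),
    (500, 520, [6]),
]
vol, origin = volume_matrix(sample_states, n_cols=4)
```

The returned origin records where the cropped matrix sits in the full grid, so the result can be overlaid on the initial grid data.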

6 Visualization of Changes of Spatio-temporal Toponymic Information

Given the known characteristics of toponymical data, semantic analysis exploits information acquired from different industries through big data analysis. Combining the semantic analysis results with a toponym’s historical information confirms whether the toponym has been newly added, altered, removed, or has disappeared, and yields the toponym’s spatio-temporal change process and state. To represent this dynamic process statically, several designs are available for the spatial states of dot-, line-, and plane-type toponyms, based on a selection of visual variables such as shape, size, and color.

Considering that a toponym change often lacks regularity, static visualization can better describe the spatio-temporal changes of toponyms, with the latter classified into three categories according to their spatial states: dot, line, and plane. To represent toponyms, visual symbols should be selected to present shape, color, and size variables. The similarities and differences before and after the toponym changes should also be considered during the design to highlight this dynamic process.

 

 

Figure 3  Flow chart of generation of spatio-temporal volume matrix of place names

 

6.1 Visualization of Spatio-temporal Changes of Dot-type Toponyms

 

Figure 4  Example of spatio-temporal changes of dot-type toponyms

Toponyms may change in two respects: spatial location and attributes. Spatially, toponyms involve three types of changes: addition, cancellation, and transfer. The first two are easily denoted through symbols, while for the third, lines indicate the locations before and after the “transfer” and are combined with different colors to create a visual hierarchy. Attribute changes are easier to visualize, as a change in toponym grade is generally represented by altering the symbol’s size or style. If the changes are more complex, symbols are extended to show the changes in the toponym’s attributes through symbol alteration. An example is illustrated in Figure 4.

6.2 Visualization of Spatio-temporal Changes of Line-type Toponyms

In this work, we utilize lines or colors to visualize spatial location changes of the line-type toponyms. Figure 5 presents road location changes, mainly including extension and rerouting.

The attribute changes of line-type toponyms are visualized more flexibly, using different line styles, colors, and line widths combined with annotations (Figure 6).

 

Figure 5  Example of visualization of spatial changes of a toponym

 

Figure 6  Example of visualization of attribute changes of a toponym

6.3 Visualization of Spatio-temporal Changes of Plane-type Toponyms

Plane-type toponyms present changes in coverage shape, area, and attributes, represented by visual tools such as color, shape, and map layers. This paper focuses on the changes in administrative districts. It is assumed that the streets, roads, towns, and residential communities of a county-level administrative district fill it without voids or overlaps in space, and that administrative districts at the same level are adjacent to, border on, or are separate from each other in space.

 

Figure 7  Example of visualization of location changes of plane-type toponyms

The spatial coverage of a plane-type toponym often changes in multiple directions, and these changes are generally accompanied by changes in shape, hierarchy, and attributes. Visualizing plane-type toponyms is more complicated than visualizing line-type toponyms, and the visualization effect varies considerably with the scale and size of the districts. Different colors represent administrative districts with different toponyms, and the hierarchy of toponyms follows the temporal sequence of their adoption: the later a toponym is adopted, the higher its hierarchy. This strategy directly shows the state of a plane-type toponym at a specific time and its changing process (Figure 7).

Annotations or symbols can be added or altered in the mapping space to visualize plane-type toponym attribute changes. For example, as shown in Figure 8, gross domestic product (GDP) changes of agriculture, industry, and service sectors in a specific administrative district can be visualized.

When many attributes of a plane-type toponym change, such as shape and boundary, several visualizations should be combined to represent the complicated change, such as simultaneously adopting color and hierarchy. In this work, we therefore adopt layer opacity as an auxiliary visualization method to display the toponym hierarchy. When maps of different time points overlap, some map layers can be displayed or hidden at the user’s discretion to demonstrate the hierarchy, which reduces or eliminates interference between map layers and expresses the target information accurately. Adjusting opacity solves the information expression problems caused by multiple overlapping layers. More complicated changes can also be demonstrated through information charts. Figure 9 illustrates an example of a plane-type toponym visualization.
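The effect of layer opacity on overlapping districts can be illustrated with standard alpha compositing. The sketch below is an assumption about how such layering could be computed, not the system’s rendering code: layers are ordered by adoption year so that later toponyms lie on top, and each layer is blended over the result so far with the usual "over" rule, `out = alpha * top + (1 - alpha) * below`. The layer tuples and the function `composite` are hypothetical.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]
Layer = Tuple[str, int, RGB, float]  # (toponym, adoption year, color, opacity)


def composite(layers: List[Layer],
              background: RGB = (255.0, 255.0, 255.0)) -> RGB:
    """Blend overlapping layers at one map location, oldest layer first."""
    color = background
    # Later adoption year means higher hierarchy, so it is drawn last (on top).
    for _name, _year, rgb, alpha in sorted(layers, key=lambda layer: layer[1]):
        color = tuple(alpha * top + (1.0 - alpha) * below
                      for top, below in zip(rgb, color))
    return color


sample_layers = [
    ("New", 1980, (0, 0, 255), 0.5),   # semi-transparent upper layer
    ("Old", 1950, (255, 0, 0), 1.0),   # opaque lower layer
]
blended = composite(sample_layers)
```

With the upper layer at 50% opacity the older district remains visible underneath, and setting a layer’s opacity to 0 has the same effect as hiding it.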

7 Functions of Integrated Spatio-temporal Toponymic Information System and Its Management

 

Figure 8  Example of visualization of attributes of a plane-type toponym

7.1 Major Functions of the Integrated Spatio-temporal Toponymic Information System

The database comprises fundamental toponym data on a nationwide or even worldwide scale together with historical toponym data of Shaanxi province. The historical data currently originate from the national historical maps in the Historical Atlas of China and from historical maps of Shaanxi province for the dynasties since the Qin dynasty. The historical information includes 9,000 pairs of ancient and current Shaanxi toponyms for comparison, 65,000 important heritage sites in Shaanxi province, more than 1,000 relics, nearly 200 traditional architectures, more than 100 ancient tombs, more than 30 religious and cultural sites, and nearly 10,000 locations mentioned in poems. All toponymic information is dynamically updated by experts who register on the platform, with the update speed subject to approval by the toponym management authorities and to the publication of research achievements. The system is updated with the latest research results using data from different research institutions.

 

Figure 9  Example of visualization of overall changes of a plane-type toponym

This research adopts big data analysis technology to obtain research results of ancient toponyms in different industries and fields, ultimately matching ancient against modern ones. Then the matching results are comprehensively analyzed to extract the toponyms’ changes and the stories related to each toponym and store them in a related database. The proposed architecture is illustrated in Figure 10.

The spatio-temporal toponymical information database primarily comprises historical map data of well-known cultural cities in Chinese history, ancient map data, historical images, and spatio-temporal toponymic data of China. The database follows technical regulations for data input and analysis and integrates a cloud model scheme. It provides tools for online mapping and visual representation. The proposed architecture can be generalized into a platform for demonstrating the results of a relevant field, exchanging ideas, and managing data, and it can also be used by history and geography researchers for sustainable GIS-based data mining and production. The integrated spatio-temporal toponymic information system built on the database supports the collection, management, updating, analysis, visual presentation, downloading, and sharing of toponyms, meeting the requirements of various fields for querying and using reliable historical toponymic information. Some functions are presented in Figure 11.

 

Figure 10  Application of big data technology in updating toponymical information

 

a. Dataset of historical toponyms    b. Visible domain analysis

c. Hot spot analysis    d. Cross-section analysis

 

Figure 11  Example display of functions of the integrated spatio-temporal toponymical information system

7.2 Management of the Integrated Spatio-temporal Toponymic Information System

This work is part of a project funded by the Ministry of Finance under the “2016 state-owned capital management budget for state-owned cultural enterprises”, categorized under state-owned economy structural adjustment expenditures with number 2230201. Xi’an Cartographic Publishing House is responsible for launching and maintaining the project, and participating organizations include the History and Geography Institute of Shaanxi Normal University, Xi’an International Studies University, Baoji University of Arts and Sciences, Capital Normal University, China Institute of Urban and Rural Construction and Cultural Heritage of Xi’an University of Architecture and Technology, Xi’an Civil Affairs Bureau, Xi’an Toponym Association, and Shaanxi Provincial Library. More organizations may be invited to participate in the project according to work demands.

8 Discussion and Conclusion

The proposed system allows updating and searching historical toponyms based on big data analysis technology and TGIS theory, revealing the value of toponyms and the relevant resources. Our method can easily be extended as a standard data and search tool for relevant researchers. Additionally, the results obtained can be edited into popular books to broadcast the related cultures and knowledge to the public online and offline.

The integrated spatio-temporal toponymic information system based on TGIS and big data analysis technology integrates toponym data acquisition, intelligent database maintenance, static visualization of dynamic data process, toponym query and assistant analysis, and display and visualization of toponym data or the related results. The proposed intelligent toponym update theory based on big data analysis technology supports querying and displaying historical toponyms and is the core process for the spatio-temporal toponymic information framework and database presented in this work.

Future work will involve expanding the database’s spatial coverage to store more historical toponyms along with their relevant data, improving the intelligent matching of ancient with current toponyms, developing toponym information indexing based on research results on ancient maps and literature, and creating a server to publicly broadcast stories of toponyms, focusing on toponyms with cultural significance. Finally, data acquisition from toponym research within institutes or management departments should be gradually improved.

 

Author Contributions

Bai, J. T. and Pan, W. designed the algorithms of the integrated historical toponymic information system; Zuo, Y. Q., Yang, H. H. and others contributed to the data processing and analysis; Bai, J. T. designed the model and algorithm; Zuo, Y. Q. performed data verification; Bai, J. T. wrote the paper.

 

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1]      Huang, D. N. Building the query and management system about toponym temporal data based on TGIS [D]. Fuzhou: Fuzhou University, 2014.

[2]      Wang, D. P. Design and implementation of Guiyang digital place name public service platform [D]. Chengdu: University of Electronic Science and Technology, 2010.

[3]      Ren, D. F., Xu, A. G., Zhu, Y. J. Establishment of Fuxin place name query system [J]. Geospatial Information, 2011, 9(6): 107-110, 1.

[4]      Liu, L., Lu, J. S. Design and implementation of geographic name information touch query system based on GIS [J]. Surveying and Mapping and Spatial Geographic Information, 2012, 35(10): 110-112.

[5]      Zhang, B. G., Wang, R. S., Gao, L. Spatio-temporal data model of point place names [J]. Land and Resources Remote Sensing, 2005(4): 82-85.

[6]      Zhou, Y. X., Fu, Z., Liu, D. W., et al. Study on temporal and spatial process of soil desertification, salinization and grassland degradation in Western Jilin province [J]. Journal of Jilin University (Geoscience Edition), 2003 (3): 348-354.

[7]      Fu, X. Q. Research on spatiotemporal state data model of historical geographical names [D]. Shijiazhuang: Hebei Normal University, 2019.

[8]      Smith, B. Engaging geography at every street corner: using place-names as critical heuristic in social studies [J]. The Social Studies, 2018, 109(2): 112-124.

[9]      Choi, S. H., Wong, C. U. I. Toponymy, place name conversion and wayfinding: South Korean independent tourists in Macau [J]. Tourism Management Perspectives, 2018, 25: 13-22.

[10]   Zhao, S. H. Design and implementation of geographic name information integration based on 3S technology [D]. Wuhan: Wuhan University, 2017.

[11]   Yao, X., Zhu, D., Ye, S., et al. A field survey system for land consolidation based on 3S and speech recognition technology [J]. Computers and Electronics in Agriculture, 2016, 127: 659-668.

[12]   Ai, J. H. Research on place name address matching algorithm [D]. Kunming: Kunming University of Technology, 2019.

[13]   Tan, Q. X. Atlas of Chinese History [M]. Beijing: China Map Publishing House, 1982.
