UDC: 
DOI: 10.22389/0016-7126-2026-1029-3-19-30
1 Vorobev A.V.
2 Vorobeva G.R.
Year: 2026
№: 1029
Pages: 19-30

1 RAS Geophysical Center
2 Ufa University of Science and Technology
Abstract:
The article presents an end-to-end methodology for converting weakly structured spatial data into standardized geospatial products. The study is motivated by the difficulty of integrating heterogeneous information that has weak structure, inconsistent coordinate formats, and polymorphic attributes. The proposed approach relies on scripted processing tools and comprises three parts: a formal mathematical apparatus based on set theory, algorithms for data parsing and normalization, and a multi-stage processing pipeline. Specialized algorithms were developed for attribute transliteration, territorial data aggregation, parsing of multi-format coordinates, and integration with reference geospatial layers. The method was implemented as a Python software package and tested while building a web GIS for monitoring geomagnetic conditions. The solution automates processing workflows, eliminates subjective errors, and generates ready-to-use GeoJSON layers suitable for analysis and visualization in modern GIS environments.
The research was supported by grant No. 25-21-00143 from the Russian Science Foundation (https://rscf.ru/project/25-21-00143/).
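To make the abstract's "parsing of multi-format coordinates" and "ready-to-use GeoJSON layers" concrete, here is a minimal Python sketch. It is not the authors' implementation. The function names `parse_coordinate` and `to_feature`, the accepted input formats (signed decimal degrees with a dot or comma, and degrees-minutes-seconds with a hemisphere letter), and the sample rows are all assumptions made for illustration.

```python
import json
import re

# Assumed DMS layout, e.g. 54°44'N or 55°58'30"E (illustrative, not from the paper).
_DMS = re.compile(
    r"""^\s*(?P<deg>\d+(?:\.\d+)?)\s*°\s*
        (?:(?P<min>\d+(?:\.\d+)?)\s*['′]\s*)?
        (?:(?P<sec>\d+(?:\.\d+)?)\s*["″]\s*)?
        (?P<hemi>[NSEW])\s*$""",
    re.VERBOSE | re.IGNORECASE,
)


def parse_coordinate(text):
    """Return signed decimal degrees from a decimal or DMS string."""
    cleaned = text.strip().replace(",", ".")  # tolerate decimal commas
    try:
        return float(cleaned)  # plain decimal degrees
    except ValueError:
        pass
    match = _DMS.match(cleaned)
    if match is None:
        raise ValueError(f"unrecognized coordinate format: {text!r}")
    value = (
        float(match["deg"])
        + float(match["min"] or 0) / 60
        + float(match["sec"] or 0) / 3600
    )
    # Southern and western hemispheres are negative.
    return -value if match["hemi"].upper() in "SW" else value


def to_feature(name, lat_text, lon_text):
    """Build a GeoJSON Feature; GeoJSON positions are ordered [lon, lat]."""
    lat = parse_coordinate(lat_text)
    lon = parse_coordinate(lon_text)
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinate out of range: {lat}, {lon}")
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name},
    }


# Made-up sample rows mixing the two assumed formats.
rows = [("Station A", "54°44'N", "55°58'E"), ("Station B", "-12,5", "130.25")]
collection = {
    "type": "FeatureCollection",
    "features": [to_feature(*row) for row in rows],
}
print(json.dumps(collection, ensure_ascii=False, indent=2))
```

Normalizing every input to signed decimal degrees before building the geometry keeps the format handling in one place, so supporting another notation would only require changing `parse_coordinate`.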
Citation:
Vorobev A.V., Vorobeva G.R. (2026) An approach to converting weakly-structured data into thematic geospatial layers. Geodesy and cartography = Geodeziya i Kartografiya, 87(3), pp. 19-30. (In Russian). DOI: 10.22389/0016-7126-2026-1029-3-19-30
Publication History
Received: 08.12.2025
Accepted: 18.02.2026
Published: 20.04.2026

Content

2026 March DOI: 10.22389/0016-7126-2026-1029-3