Textual geolocation in Hebrew: mapping challenges via natural place description analysis

Authors

DOI:

https://doi.org/10.5311/JOSIS.2024.28.323

Keywords:

textual geolocation, geographic information retrieval, Hebrew, natural language processing, spatial cognition and reasoning

Abstract

Describing where a place is situated is an innate communication skill that relies on spatial cognition, spatial reasoning, and linguistic systems. Accordingly, textual geolocation, the task of retrieving the coordinates of a place from linguistic descriptions, requires computerized spatial inference and natural language understanding. Yet machine-based textual geolocation is currently limited, mainly due to the lack of rich geo-textual datasets needed to train natural language models, which in turn cannot adequately interpret language-based expressions. These limitations are intensified in morphologically rich and resource-poor languages, such as Hebrew. This study aims to analyze and understand the linguistic systems used for place descriptions in Hebrew, to be used later to train machine learning natural language models. A novel crowdsourced geo-textual dataset is developed, composed of 5,695 written place descriptions provided by 1,554 native Hebrew speakers. All place descriptions rely on memory only, which increases spatial vagueness and requires referring expression resolution. Qualitative linguistic analysis of place descriptions shows that geospatial reasoning is used extensively in Hebrew, while empirical analysis with textual geolocation engines indicates that literal descriptions pose challenges for existing methods, as they require real understanding of space and geospatial references and cannot simply be geolocated by matching gazetteer entries with textual geo-entity extractions. The findings offer improved understanding of the challenges entailed in natural language processing of Hebrew geolocation, contributing to formalizing computerized systems used in future machine learning models for complex geographic information retrieval tasks.

Author Biographies

Tal Bauman, The Technion

Tal Bauman is a PhD Candidate in Geoinformatics at the Technion in Israel, where he is researching ways to improve geographic information retrieval. Under the supervision of Dr. Sagi Dalyot, Tal is developing innovative methodologies to collect and process human descriptions of places, creating a new spatial index and advanced retrieval techniques. His goal is to enhance the accuracy of geolocating literal descriptions, bringing a new level of understanding to geographic information derived from human descriptions.

Tzuf Paz-Argaman, Bar-Ilan University

Tzuf Paz-Argaman is a PhD Candidate at Bar-Ilan University, under the supervision of Prof. Reut Tsarfaty. Her research is in the field of natural language processing and focuses on grounded and executable semantic parsing, spatial reasoning, zero-shot learning, and multimodality.

Itai Mondshine, Bar-Ilan University

Itai Mondshine is an MSc Computer Science student at Bar-Ilan University and a Full Stack Developer. With a prior computer science degree from The Hebrew University of Jerusalem, Itai is a skilled professional with research interests in Natural Language Processing, Machine Learning, and multi-modal problems like geography and language. He is dedicated to advancing his knowledge and skills in this exciting field.

Reut Tsarfaty, Bar-Ilan University

Reut Tsarfaty is an Associate Professor at Bar-Ilan University, leading the Open Natural Language Processing research lab (The ONLP Lab). Her research focuses on natural language processing broadly interpreted to cover morphological, syntactic, and semantic phenomena, extended for the analysis of typologically different languages. She is an international expert on models for morphologically rich languages (MRLs) in general and on Hebrew NLP in particular, a founder and instigator of the SPMRL community and shared tasks, a member of the UD steering committee, and an expert at the UniMorph community. Applications Reut has worked on include (but are not limited to) natural language programming, natural language navigation, automated essay scoring, and the analysis and generation of social media content. Reut's research is funded by an ERC-Starting-Grant #677352, an ISF grant #1739/26, a grant by MOST (Israel's ministry of science and education), and previous Faculty Research Awards (FRA) from Google.

Itzhak Omer, Tel Aviv University

Itzhak Omer is a professor of urban geography in the Department of Geography and Human Environment and Head of the Urban Space Analysis Laboratory at Tel Aviv University. His research focuses on spatial behavior in urban environments, in particular the relationship between movement and the morphological, functional, and environmental characteristics of the urban environment. His areas of academic interest include spatial behavior, spatial perception and cognition, urban morphology, urban movement, pedestrian modeling, agent-based urban models, urban systems, and Israeli cities.

Sagi Dalyot, The Technion

Sagi is a faculty member in Mapping and Geoinformation Engineering at the Civil and Environmental Engineering Faculty, The Technion. His main research is in geodata science, dealing with data handling of crowdsourced user-generated content and volunteered geographic information, with a focus on the context of location-based services, wayfinding, and navigation.

323

Published

2024-06-27

Issue

Section

Research Articles