GeoAgent is introduced to improve geolocation reasoning by addressing a limitation of prior RL-based methods: their reliance on AI-generated chain-of-thought (CoT) data and training strategies that conflict with geographic characteristics. The authors create GeoSeek, a new geolocation dataset with expert-annotated CoT data, and introduce a geo-similarity reward and a consistency reward to guide training. Experiments show that GeoAgent outperforms existing methods and general VLLMs while producing human-aligned reasoning.
Expert-annotated geographic reasoning data and specialized rewards unlock superior geolocation performance in RL agents, outperforming even large VLLMs.
This paper presents GeoAgent, a model that reasons in a way closely aligned with humans and derives fine-grained address conclusions. Previous RL-based methods have achieved breakthroughs in performance and interpretability, but concerns remain because they rely on AI-generated chain-of-thought (CoT) data and training strategies that conflict with geographic characteristics. To address these issues, we first introduce GeoSeek, a new geolocation dataset comprising CoT data annotated by geographic experts and professional players. We then explore the inherent characteristics of geographic tasks and propose two rewards to assist training: a geo-similarity reward and a consistency reward assessed by a consistency agent. Together, these encourage the model to converge towards correct answers from a geographic perspective while ensuring the integrity and consistency of its reasoning process. Experimental results show that GeoAgent outperforms existing methods and a series of general VLLMs across multiple levels of granularity, while generating reasoning that closely aligns with that of humans.
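The abstract does not give the form of either reward, so the following is only an illustrative sketch of how a distance-aware reward could be combined with a consistency score. Everything in it is an assumption made for illustration rather than the authors' formulation: the exponential decay over great-circle distance, the `scale_km` and `weight` parameters, the function names, and the idea that the consistency agent returns a score in [0, 1].

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to guard against floating-point values slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def geo_similarity_reward(pred, truth, scale_km=1000.0):
    """Hypothetical reward: 1.0 for an exact match, decaying smoothly with distance.

    The exponential form and scale_km are assumptions, not taken from the paper.
    """
    distance = haversine_km(pred[0], pred[1], truth[0], truth[1])
    return math.exp(-distance / scale_km)


def total_reward(pred, truth, consistency_score, weight=0.5):
    """Hypothetical weighted mix of the geographic reward and a consistency score.

    consistency_score stands in for the output of a consistency agent and is
    assumed to lie in [0, 1]; the linear mixing is also an assumption.
    """
    if not 0.0 <= consistency_score <= 1.0:
        raise ValueError("consistency_score must be in [0, 1]")
    geo = geo_similarity_reward(pred, truth, scale_km=1000.0)
    return (1.0 - weight) * geo + weight * consistency_score


if __name__ == "__main__":
    paris = (48.8566, 2.3522)
    london = (51.5074, -0.1278)
    print(round(haversine_km(*paris, *london), 1))
    print(round(total_reward(london, paris, consistency_score=0.9), 3))
```

The point of a graded reward like this, as opposed to an exact-match one, is that a prediction in a neighbouring city scores higher than one on another continent, which is one plausible reading of converging towards correct answers "from a geographic perspective".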