DualGeo, a two-stage framework, addresses the challenge of worldwide image geo-localization by first fusing image and semantic segmentation features using bidirectional cross-attention and aligning them with GPS coordinates via dual-view contrastive learning. This creates a global retrieval database, which is then refined by re-ranking candidates using geographic clustering and feeding them into large multimodal models (LMMs) for final coordinate prediction. Experiments demonstrate that DualGeo significantly outperforms existing methods on IM2GPS, IM2GPS3k, and YFCC4k datasets, improving localization accuracy at both street and city levels.
Fusing image features with semantic segmentation and geographic clustering unlocks substantial gains in worldwide image geo-localization, surpassing prior methods by up to 16.58% at the street level.
Worldwide image geo-localization aims to infer the geographic location of an image captured anywhere on Earth, spanning street, city, regional, national, and continental scales. Existing methods rely on visual features that are sensitive to environmental variations (e.g., lighting, season, and weather) and lack effective post-processing to filter outlier candidates, limiting localization accuracy. To address these limitations, we propose DualGeo, a two-stage framework for worldwide image geo-localization. First, it establishes a geo-representational foundation by fusing image and semantic segmentation features via bidirectional cross-attention. The fused features are then aligned with GPS coordinates through dual-view contrastive learning to build a global retrieval database. Second, it performs geo-cognitive refinement by re-ranking retrieved candidates using geographic clustering and then feeds them into large multimodal models (LMMs) for final coordinate prediction. Experiments on IM2GPS, IM2GPS3k, and YFCC4k show that DualGeo outperforms state-of-the-art methods, improving street-level (<1 km) and city-level (<25 km) localization accuracy by 3.6%-16.58% and 1.29%-8.77%, respectively. Our code and datasets are available at https://github.com/CJ310177/DualGeo.
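
To make the refinement stage and the distance thresholds concrete, the following is a minimal sketch, not DualGeo's actual implementation. It assumes retrieved candidates are plain (latitude, longitude) pairs in retrieval order and uses a simple greedy clustering as a stand-in for whatever clustering the authors use. The function names (`haversine_km`, `rerank_by_cluster`, `accuracy_at`) and the 25 km default radius are hypothetical choices made for illustration; the radius only echoes the city-level threshold mentioned above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def rerank_by_cluster(candidates, radius_km=25.0):
    """Re-rank retrieved GPS candidates so geographically isolated ones drop.

    Assumption: greedy clustering. Each candidate joins the first cluster
    whose seed lies within radius_km; otherwise it seeds a new cluster.
    Clusters are ordered by size (largest first), ties broken by the best
    original retrieval rank. Order inside a cluster is preserved.
    """
    clusters = []  # each cluster is a list of (rank, coord); item 0 is the seed
    for rank, coord in enumerate(candidates):
        for cluster in clusters:
            if haversine_km(cluster[0][1], coord) <= radius_km:
                cluster.append((rank, coord))
                break
        else:
            clusters.append([(rank, coord)])
    clusters.sort(key=lambda c: (-len(c), c[0][0]))
    return [coord for cluster in clusters for _, coord in cluster]


def accuracy_at(predictions, truths, threshold_km):
    """Fraction of predictions within threshold_km of the ground truth."""
    hits = sum(haversine_km(p, t) <= threshold_km
               for p, t in zip(predictions, truths))
    return hits / len(truths)


if __name__ == "__main__":
    # A Tokyo outlier is retrieved first, followed by three Paris candidates.
    retrieved = [(35.68, 139.69), (48.8566, 2.3522), (48.86, 2.35), (48.85, 2.34)]
    print(rerank_by_cluster(retrieved))
```

In this toy input the three mutually close candidates form the largest cluster and move ahead of the isolated one, which is the outlier-filtering effect that re-ranking is meant to provide before candidates are passed to an LMM. `accuracy_at(preds, truths, 1.0)` and `accuracy_at(preds, truths, 25.0)` correspond to the street-level and city-level thresholds.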