Institut Polytechnique de Paris, LASTIG, Univ Gustave Eiffel | Apr 13, 2026 | arXiv:2604.11668

UNIGEOCLIP: Unified Geospatial Contrastive Learning

Guillaume Astruc, Eduard Trulls, Jan Hosang, Loic Landrieu, Paul-Edouard Sarlin

AI Summary

UNIGEOCLIP, a novel multimodal contrastive learning framework, aligns aerial imagery, street-level views, elevation models, text, and geographic coordinates into a unified embedding space. It achieves this by performing all-to-all contrastive alignment across modalities, avoiding reliance on a central pivot representation or modality fusion. Experiments on downstream geospatial tasks show UNIGEOCLIP outperforms single-modality and coordinate-only baselines, demonstrating the value of holistic multimodal alignment.
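The all-to-all alignment described above can be illustrated with a minimal sketch: a symmetric InfoNCE loss is computed for every unordered pair of modalities and averaged, so no modality acts as a pivot. This is an illustration under stated assumptions, not the authors' implementation. The CLIP-style symmetric loss, the temperature of 0.07, the uniform averaging over pairs, and the function names `symmetric_info_nce` and `all_to_all_loss` are all assumptions made for this example.

```python
import itertools

import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def symmetric_info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE between two batches; row i of `a` matches row i of `b`."""
    logits = l2_normalize(a) @ l2_normalize(b).T / temperature
    idx = np.arange(len(a))
    loss_ab = -log_softmax(logits, axis=1)[idx, idx].mean()  # a retrieves b
    loss_ba = -log_softmax(logits, axis=0)[idx, idx].mean()  # b retrieves a
    return 0.5 * (loss_ab + loss_ba)


def all_to_all_loss(embeddings, temperature=0.07):
    """Average the pairwise loss over every unordered pair of modalities.

    `embeddings` maps a modality name to an (N, D) array in which row i of
    every modality describes the same location. Uniform weighting of the
    pairs is an assumption of this sketch.
    """
    pairs = list(itertools.combinations(sorted(embeddings), 2))
    losses = [
        symmetric_info_nce(embeddings[m], embeddings[n], temperature)
        for m, n in pairs
    ]
    return float(np.mean(losses)), pairs


# Toy data: five modalities, 8 locations, 16-dimensional embeddings.
rng = np.random.default_rng(0)
modalities = ["aerial", "street", "elevation", "text", "coords"]
shared = rng.normal(size=(8, 16))
aligned = {m: shared + 0.01 * rng.normal(size=shared.shape) for m in modalities}
unaligned = {m: rng.normal(size=(8, 16)) for m in modalities}

aligned_loss, pairs = all_to_all_loss(aligned)
unaligned_loss, _ = all_to_all_loss(unaligned)
print(len(pairs), aligned_loss < unaligned_loss)
```

With five modalities there are ten unordered pairs, and each contributes its own contrastive term. A pivot-based scheme would instead align only the four pairs that include the pivot.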

Key Contribution

Jointly embeds aerial imagery, street-level views, elevation models, text, and latitude-longitude coordinates into a single space, enabling comparison and retrieval across arbitrary combinations of modalities without a pivot modality.

Abstract

The growing availability of co-located geospatial data spanning aerial imagery, street-level views, elevation models, text, and geographic coordinates offers a unique opportunity for multimodal representation learning. We introduce UNIGEOCLIP, a massively multimodal contrastive framework to jointly align five complementary geospatial modalities in a single unified embedding space. Unlike prior approaches that fuse modalities or rely on a central pivot representation, our method performs all-to-all contrastive alignment, enabling seamless comparison, retrieval, and reasoning across arbitrary combinations of modalities. We further propose a scaled latitude-longitude encoder that improves spatial representation by capturing multi-scale geographic structure. Extensive experiments across downstream geospatial tasks demonstrate that UNIGEOCLIP consistently outperforms single-modality contrastive models and coordinate-only baselines, highlighting the benefits of holistic multimodal geospatial alignment. A reference implementation is available at https://gastruc.github.io/unigeoclip.
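The abstract does not specify how the scaled latitude-longitude encoder is built. As a rough illustration of what "multi-scale geographic structure" can mean for a coordinate input, the sketch below computes generic multi-frequency sinusoidal features, a common way to expose coordinates at several spatial scales. It is not claimed to be the authors' encoder. The function name `multiscale_latlon_features`, the power-of-two frequencies, and the default of six scales are assumptions made for this example.

```python
import numpy as np


def multiscale_latlon_features(lat_deg, lon_deg, num_scales=6):
    """Encode coordinates as sines and cosines at doubling frequencies.

    Low frequencies vary slowly across the globe (coarse structure), while
    high frequencies separate nearby points (fine structure). Returns an
    array of shape (N, 4 * num_scales). Hypothetical helper for illustration.
    """
    lat = np.radians(np.atleast_1d(np.asarray(lat_deg, dtype=np.float64)))
    lon = np.radians(np.atleast_1d(np.asarray(lon_deg, dtype=np.float64)))
    coords = np.stack([lat, lon], axis=-1)                # (N, 2)
    freqs = 2.0 ** np.arange(num_scales)                  # 1, 2, 4, ...
    angles = coords[:, :, None] * freqs[None, None, :]    # (N, 2, S)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(len(coords), -1)


# Two nearby points in Paris and one point on the antimeridian.
feats = multiscale_latlon_features(
    [48.8566, 48.8570, 0.0],
    [2.3522, 2.3522, 180.0],
)

# Integer frequencies in radians keep the features continuous across the
# +180 / -180 degree longitude seam.
wrap = multiscale_latlon_features([0.0, 0.0], [180.0, -180.0])
print(feats.shape, np.allclose(wrap[0], wrap[1]))
```

In a contrastive setup, features like these would typically be fed to a small learned network that maps them into the shared embedding space. That step is omitted here.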

Computer Vision, Data Curation & Synthetic Data, Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
