The paper introduces Wrivinder, a zero-shot framework that aligns ground-level multi-view imagery with satellite imagery. It reconstructs a 3D scene using SfM, 3D Gaussian Splatting, semantic grounding, and monocular depth cues, then generates a zenith-view rendering of that scene. The rendering is matched directly to satellite imagery for camera geo-localization, which addresses large viewpoint gaps and unreliable GPS. The authors also contribute MC-Sat, a new dataset for evaluating ground-to-satellite image alignment, and report that Wrivinder achieves sub-30 m geolocation accuracy in zero-shot experiments.
Combines SfM, 3D Gaussian Splatting, and semantic grounding to bridge the ground-to-satellite viewpoint gap, reaching sub-30 m zero-shot geo-localization accuracy.
Aligning ground-level imagery with geo-registered satellite maps is crucial for mapping, navigation, and situational awareness, yet remains challenging under large viewpoint gaps or when GPS is unreliable. We introduce Wrivinder, a zero-shot, geometry-driven framework that aggregates multiple ground photographs to reconstruct a consistent 3D scene and align it with overhead satellite imagery. Wrivinder combines SfM reconstruction, 3D Gaussian Splatting, semantic grounding, and monocular depth-based metric cues to produce a stable zenith-view rendering that can be directly matched to satellite context for metrically accurate camera geo-localization. To support systematic evaluation of this task, which lacks suitable benchmarks, we also release MC-Sat, a curated dataset linking multi-view ground imagery with geo-registered satellite tiles across diverse outdoor environments. Together, Wrivinder and MC-Sat provide a first comprehensive baseline and testbed for studying geometry-centered cross-view alignment without paired supervision. In zero-shot experiments, Wrivinder achieves sub-30 m geolocation accuracy across both dense and large-area scenes, highlighting the promise of geometry-based aggregation for robust ground-to-satellite localization.
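To make the "sub-30 m" figure concrete, the sketch below shows one way a matched pixel position in a geo-registered satellite tile could be turned into coordinates and scored against a reference position. It is an illustration under stated assumptions, not the authors' evaluation code: the function names, the north-up tile with a known top-left corner and ground resolution, and the small-area flat-earth conversion are all hypothetical choices made here. Only the haversine great-circle distance is a standard formula.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def pixel_to_latlon(row, col, top_left_lat, top_left_lon, metres_per_pixel):
    """Convert a pixel position in a north-up tile to (lat, lon) in degrees.

    Assumption: the tile is small enough that a local flat-earth
    approximation around its top-left corner is adequate.
    """
    north_m = -row * metres_per_pixel  # rows grow southwards
    east_m = col * metres_per_pixel    # columns grow eastwards
    lat = top_left_lat + math.degrees(north_m / EARTH_RADIUS_M)
    lon_scale = EARTH_RADIUS_M * math.cos(math.radians(top_left_lat))
    lon = top_left_lon + math.degrees(east_m / lon_scale)
    return lat, lon


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_threshold(pred, truth, threshold_m=30.0):
    """True if the predicted (lat, lon) lies within threshold_m of truth."""
    return haversine_m(pred[0], pred[1], truth[0], truth[1]) <= threshold_m
```

With a hypothetical tile at 0.5 m per pixel, a predicted match at `pixel_to_latlon(100, 0, 48.0, 11.0, 0.5)` lies roughly 50 m south of the tile corner, so `within_threshold` against the corner itself would return `False` at the 30 m threshold.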