Mar 4, 2026 | arXiv:2603.04272

SSR: A Generic Framework for Text-Aided Map Compression for Localization

Mohammad Omama, Po-han Li, Harsh Goel, Minkyu Choi, Behdad Chalaki, Vaishnav Tadiparthi, Hossein Nourkhiz Mahjoub, Ehsan Moradi Pari, Sandeep P. Chinchali

AI Summary

This paper introduces Similarity Space Replication (SSR), a text-enhanced map compression framework that combines lightweight text descriptions with small image feature vectors to reduce the memory and bandwidth costs of robot localization. SSR learns an adaptive image embedding that captures only the information complementary to the text descriptions, enabling high-fidelity localization with significantly smaller maps. Experiments on TokyoVal, Pittsburgh30k, Replica, and KITTI show that SSR achieves 2x better compression than competing baselines while maintaining localization performance.

Key Contribution

Text descriptions, combined with small learned image embeddings, yield 2x better map compression than competing baselines while preserving localization accuracy, offering a practical answer to the growing memory and bandwidth demands of robotic mapping.

Abstract

Mapping is crucial in robotics for localization and downstream decision-making. As robots are deployed in ever-broader settings, the maps they rely on continue to increase in size. However, storing these maps indefinitely (cold storage), transferring them across networks, or sending localization queries to cloud-hosted maps imposes prohibitive memory and bandwidth costs. We propose a text-enhanced compression framework that reduces both memory and bandwidth footprints while retaining high-fidelity localization. The key idea is to treat text as an alternative modality: one that can be losslessly compressed with large language models. We propose leveraging lightweight text descriptions combined with very small image feature vectors, which capture "complementary information" as a compact representation for the mapping task. Building on this, our novel technique, Similarity Space Replication (SSR), learns an adaptive image embedding in one shot that captures only the information "complementary" to the text descriptions. We validate our compression framework on multiple downstream localization tasks, including Visual Place Recognition as well as object-centric Monte Carlo localization in both indoor and outdoor settings. SSR achieves 2 times better compression than competing baselines on state-of-the-art datasets, including TokyoVal, Pittsburgh30k, Replica, and KITTI.
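The idea of pairing a text description with a very small image feature vector can be pictured with a toy retrieval example. The sketch below is not the SSR method and does not reproduce how the paper learns its embedding; it only illustrates, under stated assumptions, how a map entry made of a text embedding plus a compact image embedding might be matched against a query by cosine similarity, as in Visual Place Recognition. The embeddings are random stand-ins, and the names `fuse`, `localize`, the dimensions, and the noise level are all hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def fuse(text_emb, image_emb):
    """Hypothetical fusion: normalize each part, concatenate, renormalize.

    This is an illustrative assumption, not the fusion used in the paper.
    """
    t = l2_normalize(text_emb)
    v = l2_normalize(image_emb)
    return l2_normalize(np.concatenate([t, v], axis=-1))


def localize(query, map_entries):
    """Return the index of the most similar map entry and all similarities."""
    sims = map_entries @ query  # cosine similarity, since rows are unit length
    return int(np.argmax(sims)), sims


# Synthetic stand-ins for real embeddings (assumed sizes: large text, tiny image).
rng = np.random.default_rng(0)
num_places, text_dim, image_dim = 5, 32, 4
map_text = rng.normal(size=(num_places, text_dim))
map_image = rng.normal(size=(num_places, image_dim))
map_entries = fuse(map_text, map_image)

# A query observed near place 3, with small perturbations on both modalities.
true_place = 3
query_text = map_text[true_place] + 0.05 * rng.normal(size=text_dim)
query_image = map_image[true_place] + 0.05 * rng.normal(size=image_dim)
query = fuse(query_text, query_image)

best, sims = localize(query, map_entries)
print("retrieved place:", best)
```

In this toy setup the stored map is one fused vector per place; the abstract's point is that the image part can be kept very small when it only needs to carry what the text does not.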

Inference & Quantization, Natural Language Processing, Robotics & Embodied AI

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
