This paper introduces Similarity Space Replication (SSR), a text-enhanced map compression framework that uses lightweight text descriptions and small image feature vectors to reduce memory and bandwidth costs for robot localization. SSR learns an adaptive image embedding that captures information complementary to the text descriptions, enabling high-fidelity localization with significantly smaller maps. Experiments on datasets including TokyoVal and KITTI show that SSR achieves 2x better compression than competing baselines while maintaining localization performance.
Text descriptions, combined with small learned image embeddings, compress maps 2x better than competing baselines while preserving localization accuracy, offering a practical answer to the growing memory demands of robotic mapping.
Mapping is crucial in robotics for localization and downstream decision-making. As robots are deployed in ever-broader settings, the maps they rely on continue to increase in size. However, storing these maps indefinitely (cold storage), transferring them across networks, or sending localization queries to cloud-hosted maps imposes prohibitive memory and bandwidth costs. We propose a text-enhanced compression framework that reduces both memory and bandwidth footprints while retaining high-fidelity localization. The key idea is to treat text as an alternative modality: one that can be losslessly compressed with large language models. Specifically, we combine lightweight text descriptions with very small image feature vectors, which capture "complementary information", as a compact representation for the mapping task. Building on this, our novel technique, Similarity Space Replication (SSR), learns, in one shot, an adaptive image embedding that captures only the information "complementary" to the text descriptions. We validate our compression framework on multiple downstream localization tasks, including Visual Place Recognition as well as object-centric Monte Carlo localization in both indoor and outdoor settings. SSR achieves 2 times better compression than competing baselines on state-of-the-art datasets, including TokyoVal, Pittsburgh30k, Replica, and KITTI.
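The abstract does not spell out the SSR objective, so the following is only a minimal sketch of one plausible reading of "complementary" information, not the authors' method. It assumes, hypothetically, that the goal is to reproduce the pairwise similarity matrix of full image descriptors using text-embedding similarities plus a small per-image vector. The function name `complementary_embedding`, the closed-form eigendecomposition, and the synthetic data are all illustrative assumptions.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def complementary_embedding(full_feats, text_feats, k):
    """Return an (n, k) matrix Z with text_sim + Z @ Z.T approximating full_sim.

    Hypothetical objective: Z only has to explain the similarity structure
    that the text embeddings miss (the residual), not the full descriptors.
    """
    full_sim = full_feats @ full_feats.T
    text_sim = text_feats @ text_feats.T
    residual = full_sim - text_sim               # symmetric by construction
    eigvals, eigvecs = np.linalg.eigh(residual)  # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]          # indices of the k largest
    lam = np.clip(eigvals[top], 0.0, None)       # Z @ Z.T can only add PSD mass
    return eigvecs[:, top] * np.sqrt(lam)

# Synthetic stand-ins (assumptions): 256-d "full" image descriptors and
# 32-d "text" embeddings that retain only part of the same information.
rng = np.random.default_rng(0)
n = 50
full_feats = l2_normalize(rng.normal(size=(n, 256)))
text_feats = l2_normalize(full_feats[:, :32] + 0.1 * rng.normal(size=(n, 32)))

Z = complementary_embedding(full_feats, text_feats, k=8)

full_sim = full_feats @ full_feats.T
text_sim = text_feats @ text_feats.T
err_text_only = np.linalg.norm(full_sim - text_sim)
err_combined = np.linalg.norm(full_sim - (text_sim + Z @ Z.T))
print(f"text only: {err_text_only:.3f}  text + 8-d vector: {err_combined:.3f}")
```

In this toy setting each map entry would store its text description plus an 8-dimensional vector in place of a 256-dimensional descriptor. The sketch covers only the map side; embedding new query images, which a real localization pipeline requires, is not addressed here.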