Dream-SLAM addresses limitations of active SLAM, particularly in dynamic environments, by generating cross-spatio-temporal images and semantically plausible structures of partially observed scenes. The dreamed images are fused with real observations to improve camera pose estimation and 3D scene representation. Integrating dreamed and observed scene structures enables long-horizon planning, which yields more efficient and thorough exploration in experiments on public and self-collected datasets.
By "dreaming" plausible scene completions, Dream-SLAM enables robots to navigate dynamic environments more effectively, achieving better localization, mapping, and exploration than existing methods.
In addition to the core tasks of simultaneous localization and mapping (SLAM), active SLAM involves generating robot actions that enable effective and efficient exploration of unknown environments. However, existing active SLAM pipelines are limited by three main factors. First, they inherit the restrictions of the underlying SLAM modules they rely on. Second, their motion planning strategies are typically shortsighted and lack long-term vision. Third, most approaches struggle to handle dynamic scenes. To address these limitations, we propose a novel monocular active SLAM method, Dream-SLAM, which is based on dreaming cross-spatio-temporal images and semantically plausible structures of partially observed dynamic environments. The generated cross-spatio-temporal images are fused with real observations to mitigate noise and data incompleteness, leading to more accurate camera pose estimation and a more coherent 3D scene representation. Furthermore, we integrate dreamed and observed scene structures to enable long-horizon planning, producing farsighted trajectories that promote efficient and thorough exploration. Extensive experiments on both public and self-collected datasets demonstrate that Dream-SLAM outperforms state-of-the-art methods in localization accuracy, mapping quality, and exploration efficiency. Source code will be publicly available upon paper acceptance.