Fudan · Apr 5, 2026 · arXiv:2604.04055

DINO-VO: Learning Where to Focus for Enhanced State Estimation

Sijia Hu, Xin Gao, Junpeng Ma, Xiangyang Xue, Jian Pu

AI Summary

DINO-VO is a monocular visual odometry system that learns to select informative image patches through a differentiable adaptive patch selector, which improves generalization across diverse scenes. It couples a multi-task feature extraction module with a differentiable bundle adjustment module that uses inverse depth priors, so the system exploits both appearance and geometric information. Experiments on the TartanAir, KITTI, EuRoC, and TUM datasets show that DINO-VO achieves state-of-the-art tracking accuracy and strong generalization.

Key Contribution

DINO-VO replaces heuristic feature extraction with learned patch selection and couples it with differentiable bundle adjustment, reporting state-of-the-art monocular visual odometry accuracy and strong generalization across synthetic, indoor, and outdoor datasets.

Abstract

We present DINO Patch Visual Odometry (DINO-VO), an end-to-end monocular visual odometry system with strong scene generalization. Current Visual Odometry (VO) systems often rely on heuristic feature extraction strategies, which can degrade accuracy and robustness, particularly in large-scale outdoor environments. DINO-VO addresses these limitations by incorporating a differentiable adaptive patch selector into the end-to-end pipeline, improving the quality of extracted patches and enhancing generalization across diverse datasets. Additionally, our system integrates a multi-task feature extraction module with a differentiable bundle adjustment (BA) module that leverages inverse depth priors, enabling the system to learn and utilize appearance and geometric information effectively. This integration bridges the gap between feature learning and state estimation. Extensive experiments on the TartanAir, KITTI, EuRoC, and TUM datasets demonstrate that DINO-VO exhibits strong generalization across synthetic, indoor, and outdoor environments, achieving state-of-the-art tracking accuracy.
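
The abstract does not spell out the cost function used in the BA module. As a rough illustration of how an inverse depth prior can enter a bundle adjustment residual, the sketch below assumes a standard pinhole camera, a single patch center parameterized by inverse depth in its source frame, and a simple weighted prior term. The function names, the `prior_weight` parameter, and the residual layout are hypothetical and are not taken from the paper.

```python
import numpy as np

def reproject(uv, inv_depth, K, R, t):
    """Map pixel `uv` in frame i into frame j (pinhole model, assumed).

    R, t transform points from frame i to frame j. The 3D point is
    ray / inv_depth; multiplying the transformed point by inv_depth does
    not change its projection, so we use R @ ray + inv_depth * t.
    """
    ray = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
    p = R @ ray + inv_depth * t
    q = K @ p
    return q[:2] / q[2]

def ba_residual(uv_i, uv_j_obs, inv_depth, inv_depth_prior, K, R, t,
                prior_weight=1.0):
    """Stack a 2D reprojection error with a hypothetical inverse depth prior."""
    r_reproj = reproject(uv_i, inv_depth, K, R, t) - np.asarray(uv_j_obs, float)
    r_prior = prior_weight * (inv_depth - inv_depth_prior)
    return np.concatenate([r_reproj, [r_prior]])

if __name__ == "__main__":
    K = np.array([[100.0, 0.0, 50.0],
                  [0.0, 100.0, 50.0],
                  [0.0, 0.0, 1.0]])
    R = np.eye(3)
    t = np.array([1.0, 0.0, 0.0])
    # A point at depth 2 on the optical axis, observed after a unit shift in x.
    print(reproject((50.0, 50.0), 0.5, K, R, t))
    print(ba_residual((50.0, 50.0), (100.0, 50.0), 0.5, 0.4, K, R, t, 2.0))
```

In a full system, a residual of this form would be minimized jointly over poses and inverse depths (for example with Gauss-Newton), and making that solve differentiable is what allows gradients to flow back into the feature and patch selection networks.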

Computer Vision · Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
