Apr 14, 2026 · arXiv:2604.12933

DINO-Explorer: Active Underwater Discovery via Ego-Motion Compensated Semantic Predictive Coding

Yuhan Jin, Nayari Marie Lessa, Mariela De Lucas Alvarez, Melvin Laux, L. A. Barbosa, Lucas Amparo Barbosa, Frank Kirchner, Rebecca Adam

AI Summary

DINO-Explorer is introduced as a novelty-aware perception framework for autonomous underwater vehicles (AUVs) that uses a continuous semantic surprise signal in the latent space of a frozen DINOv3 model. It employs an action-conditioned recurrent predictor to anticipate semantic evolution and discounts self-induced visual changes using an efference-copy-inspired module based on optical flow. Experiments demonstrate that DINO-Explorer effectively surfaces mission-relevant phenomena, retaining 78.8% of human-reviewer consensus events with a 56.8% trigger confirmation rate, and reduces telemetry bandwidth by 48.2% while maintaining a 62.2% peak F1 score.
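The reported figures can be made concrete with a small scoring sketch. The definitions below are assumptions for illustration, not the paper's evaluation protocol: "retention" is taken as the fraction of reviewer-consensus event windows hit by at least one trigger, "trigger confirmation rate" as the fraction of triggers that fall inside some event window, and bandwidth is counted in transmitted frames rather than bytes. The function name `triage_metrics` and the `context` parameter are hypothetical.

```python
def triage_metrics(trigger_frames, event_windows, total_frames, context=0):
    """Score a set of triggers against annotated event windows.

    trigger_frames: frame indices at which the system fired (assumed input).
    event_windows:  (start, end) inclusive frame ranges of consensus events.
    total_frames:   length of the recording in frames.
    context:        frames sent on each side of a trigger (assumed policy).
    """
    def inside_any_event(frame):
        return any(start <= frame <= end for start, end in event_windows)

    # Triggers that land inside at least one event window.
    confirmed = sum(1 for f in trigger_frames if inside_any_event(f))
    # Event windows that contain at least one trigger.
    retained = sum(
        1 for start, end in event_windows
        if any(start <= f <= end for f in trigger_frames)
    )

    # Frames transmitted: each trigger plus its surrounding context.
    sent = set()
    for f in trigger_frames:
        sent.update(range(max(0, f - context), min(total_frames, f + context + 1)))

    return {
        "confirmation_rate": confirmed / len(trigger_frames) if trigger_frames else 0.0,
        "retention": retained / len(event_windows) if event_windows else 0.0,
        "bandwidth_reduction": 1.0 - len(sent) / total_frames,
    }


# Toy example with made-up numbers (not data from the paper).
metrics = triage_metrics(
    trigger_frames=[10, 50, 90],
    event_windows=[(8, 12), (40, 45), (88, 95)],
    total_frames=100,
    context=2,
)
```

In this toy case two of three triggers are confirmed, two of three events are retained, and 15 of 100 frames are sent. The paper's actual matching rules and bandwidth accounting may differ.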

Key Contribution

By predicting semantic changes in underwater video and filtering out ego-motion, DINO-Explorer lets autonomous vehicles focus on genuine novelty rather than on visual changes caused by their own maneuvers.

Abstract

Marine ecosystem degradation necessitates continuous, scientifically selective underwater monitoring. However, most autonomous underwater vehicles (AUVs) operate as passive data loggers, capturing exhaustive video for offline review and frequently missing transient events of high scientific value. Transitioning to active perception requires a causal, online signal that highlights significant phenomena while suppressing maneuver-induced visual changes. We propose DINO-Explorer, a novelty-aware perception framework driven by a continuous semantic surprise signal. Operating within the latent space of a frozen DINOv3 foundation model, it leverages a lightweight, action-conditioned recurrent predictor to anticipate short-horizon semantic evolution. An efference-copy-inspired module utilizes globally pooled optical flow to discount self-induced visual changes without suppressing genuine environmental novelty. We evaluate this signal on the downstream task of asynchronous event triage under varying telemetry constraints. Results demonstrate that DINO-Explorer provides a robust, bandwidth-efficient attention mechanism. At a fixed operating point, the system retains 78.8% of post-discovery human-reviewer consensus events with a 56.8% trigger confirmation rate, effectively surfacing mission-relevant phenomena. Crucially, ego-motion conditioning suppresses 45.5% of false positives relative to an uncompensated surprise signal baseline. In a replay-side Pareto ablation study, DINO-Explorer robustly dominates the validated peak F1 versus telemetry bandwidth frontier, reducing telemetry bandwidth by 48.2% at the selected operating point while maintaining a 62.2% peak F1 score, successfully concentrating data transmission around human-verified novelty events.
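The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: random vectors stand in for frozen DINOv3 embeddings, the predictor is an untrained Elman-style recurrent cell, the "action" is taken to be the mean of the dense optical flow field, and surprise is taken to be cosine distance between the predicted and observed next embedding. The names `pool_flow`, `cosine_surprise`, `RecurrentPredictor`, and `surprise_trace` are hypothetical.

```python
import numpy as np


def pool_flow(flow):
    """Globally pool a dense flow field of shape (H, W, 2) into mean (dx, dy)."""
    return flow.reshape(-1, 2).mean(axis=0)


def cosine_surprise(z_pred, z_obs, eps=1e-8):
    """Cosine distance between predicted and observed embeddings (assumed metric)."""
    num = float(np.dot(z_pred, z_obs))
    den = float(np.linalg.norm(z_pred) * np.linalg.norm(z_obs)) + eps
    return 1.0 - num / den


class RecurrentPredictor:
    """Untrained Elman-style cell: a stand-in for the paper's learned predictor."""

    def __init__(self, embed_dim, action_dim=2, hidden_dim=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W_z = rng.normal(0.0, 0.1, (hidden_dim, embed_dim))
        self.W_a = rng.normal(0.0, 0.1, (hidden_dim, action_dim))
        self.W_h = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim))
        self.W_o = rng.normal(0.0, 0.1, (embed_dim, hidden_dim))
        self.h = np.zeros(hidden_dim)

    def step(self, z, action):
        # Condition the hidden state on the current embedding and the
        # ego-motion cue, then read out a prediction of the next embedding.
        self.h = np.tanh(self.W_z @ z + self.W_a @ action + self.W_h @ self.h)
        return self.W_o @ self.h


def surprise_trace(embeddings, flows, predictor):
    """Causal, per-frame surprise: compare each prediction with the next frame."""
    scores = []
    for t in range(len(embeddings) - 1):
        action = pool_flow(flows[t])
        z_pred = predictor.step(embeddings[t], action)
        scores.append(cosine_surprise(z_pred, embeddings[t + 1]))
    return np.array(scores)


# Synthetic stand-ins: 5 frames, 16-dim embeddings, 8x8 flow fields.
rng = np.random.default_rng(1)
embeddings = rng.normal(size=(5, 16))
flows = rng.normal(size=(5, 8, 8, 2))
trace = surprise_trace(embeddings, flows, RecurrentPredictor(embed_dim=16))
```

Because the weights here are random, the resulting trace carries no meaning; the sketch only shows the data flow. In a trained system, feeding the pooled flow into the predictor is what would let it anticipate maneuver-induced change, so that only unanticipated change registers as surprise.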

Computer Vision · Robotics & Embodied AI · World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 24

Year: 2026

Venue: N/A
