NavThinker addresses the coupled prediction-planning challenge in social navigation by pairing an action-conditioned world model with on-policy reinforcement learning. The world model predicts future scene geometry and human motion in the Depth Anything V2 patch feature space. The policy, trained with DD-PPO, uses these predictions in two ways: future features fused into its observation embedding, and social reward shaping from predicted human trajectories. The authors report state-of-the-art navigation success on single- and multi-robot Social-HM3D, zero-shot transfer to Social-MP3D, and real-world deployment on a Unitree Go2.
NavThinker lets robots plan through crowded spaces with foresight, using a world model that predicts how the scene and nearby people will evolve under the robot's candidate actions.
Social navigation requires robots to act safely in dynamic human environments. Effective behavior demands thinking ahead: reasoning about how the scene and pedestrians evolve under different robot actions rather than reacting to current observations alone. This creates a coupled prediction-planning challenge, where robot actions and human motion mutually influence each other. To address this challenge, we propose NavThinker, a future-aware framework that couples an action-conditioned world model with on-policy reinforcement learning. The world model operates in the Depth Anything V2 patch feature space and performs autoregressive prediction of future scene geometry and human motion; multi-head decoders then produce future depth maps and human trajectories, yielding a future-aware state aligned with traversability and interaction risk. Crucially, we train the policy with DD-PPO while injecting world-model think-ahead signals via: (i) action-conditioned future features fused into the current observation embedding and (ii) social reward shaping from predicted human trajectories. Experiments on single- and multi-robot Social-HM3D show state-of-the-art navigation success, with zero-shot transfer to Social-MP3D and real-world deployment on a Unitree Go2, validating generalization and practical applicability. Webpage: https://github.com/hutslib/NavThinker.
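The two think-ahead signals named in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion operator or the reward formula, so everything here is an assumption made for illustration: the function names `fuse_features` and `social_shaping_reward`, the mean-pooling-plus-concatenation fusion, the hinge-style proximity penalty, and the parameters `safe_dist`, `discount`, and `weight` are all hypothetical and are not taken from the NavThinker code.

```python
import numpy as np


def fuse_features(obs_embedding, future_features):
    """Fuse predicted future features into the current observation embedding.

    Assumed form (not from the paper): mean-pool the H predicted feature
    vectors over the horizon and concatenate with the current embedding.
    obs_embedding: shape (D,); future_features: shape (H, D_f).
    """
    obs_embedding = np.asarray(obs_embedding, dtype=float)
    future_features = np.asarray(future_features, dtype=float)
    pooled = future_features.mean(axis=0)  # (D_f,)
    return np.concatenate([obs_embedding, pooled])  # (D + D_f,)


def social_shaping_reward(robot_path, human_paths,
                          safe_dist=1.0, discount=0.9, weight=0.5):
    """Shaping term computed from predicted human trajectories.

    Assumed form (not from the paper): a discounted hinge penalty on how far
    the nearest predicted human intrudes into a safety radius at each step.
    robot_path: shape (H, 2); human_paths: shape (N, H, 2).
    Returns a non-positive float.
    """
    robot_path = np.asarray(robot_path, dtype=float)
    human_paths = np.asarray(human_paths, dtype=float)
    if human_paths.size == 0:
        return 0.0  # no predicted humans, no social penalty

    # Distance from the robot to every human at every future step: (N, H).
    dists = np.linalg.norm(human_paths - robot_path[None, :, :], axis=-1)
    nearest = dists.min(axis=0)  # closest human per step: (H,)

    # Penalize only the part of the safety radius that is violated.
    intrusion = np.clip(safe_dist - nearest, 0.0, None)
    discounts = discount ** np.arange(len(nearest))
    return -weight * float(np.sum(discounts * intrusion))
```

In a setup like this, the fused vector would be fed to the policy network in place of the raw observation embedding, and the shaping term would be added to the environment reward before the DD-PPO update. Discounting later steps reflects the assumption that longer-horizon predictions are less reliable.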