Mar 16, 2026 | arXiv:2603.15359

NavThinker: Action-Conditioned World Models for Coupled Prediction and Planning in Social Navigation

Tianshuai Hu, Zeying Gong, Lingdong Kong, XiaoDong Mei, Yiyi Ding, Qi Zeng, Ao Liang, Rong Li, Yangyi Zhong, Junwei Liang

AI Summary

NavThinker addresses the coupled prediction-planning challenge in social navigation by integrating an action-conditioned world model with on-policy reinforcement learning. The world model predicts future scene geometry and human motion in the Depth Anything V2 patch feature space. The policy, trained with DD-PPO, receives these future-aware signals in two ways: action-conditioned future features fused into the current observation embedding, and social reward shaping derived from predicted human trajectories. Experiments on single- and multi-robot Social-HM3D report state-of-the-art navigation success, with zero-shot transfer to Social-MP3D and real-world deployment on a Unitree Go2.
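To make the idea of action-conditioned, autoregressive prediction concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: `rollout`, `fuse`, and `toy_step` are hypothetical names, the latent is a short list of floats rather than Depth Anything V2 patch features, and fusion by concatenating the mean future latent is an assumption chosen only for illustration.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def rollout(step: Callable[[Vector, Vector], Vector],
            z0: Vector,
            actions: Sequence[Vector]) -> List[Vector]:
    """Predict future latents autoregressively.

    Each predicted latent is fed back as the input for the next step,
    conditioned on the candidate action at that step.
    """
    latents: List[Vector] = []
    z = list(z0)
    for a in actions:
        z = step(z, a)
        latents.append(z)
    return latents


def fuse(current: Vector, futures: Sequence[Vector]) -> Vector:
    """Concatenate the current embedding with the mean predicted latent.

    This fusion rule is an assumption for illustration, not the paper's.
    """
    dim = len(current)
    if not futures:
        return list(current) + [0.0] * dim
    mean = [sum(f[i] for f in futures) / len(futures) for i in range(dim)]
    return list(current) + mean


def toy_step(z: Vector, a: Vector) -> Vector:
    """Stand-in dynamics (hypothetical): shift the first two latent
    dimensions by the action (vx, vy) and leave the rest unchanged."""
    return [z[0] + a[0], z[1] + a[1]] + list(z[2:])


if __name__ == "__main__":
    z0 = [0.0, 0.0, 1.0]
    plan = [[1.0, 0.0], [1.0, 0.0]]  # two candidate actions
    futures = rollout(toy_step, z0, plan)
    print(fuse(z0, futures))
```

In a learned system `toy_step` would be replaced by the trained world model, and the fused vector would be the input to the policy network.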

Key Contribution

An action-conditioned world model lets a navigation policy anticipate how the scene and pedestrians will evolve under different robot actions, rather than reacting to current observations alone.

Abstract

Social navigation requires robots to act safely in dynamic human environments. Effective behavior demands thinking ahead: reasoning about how the scene and pedestrians evolve under different robot actions rather than reacting to current observations alone. This creates a coupled prediction-planning challenge, where robot actions and human motion mutually influence each other. To address this challenge, we propose NavThinker, a future-aware framework that couples an action-conditioned world model with on-policy reinforcement learning. The world model operates in the Depth Anything V2 patch feature space and performs autoregressive prediction of future scene geometry and human motion; multi-head decoders then produce future depth maps and human trajectories, yielding a future-aware state aligned with traversability and interaction risk. Crucially, we train the policy with DD-PPO while injecting world-model think-ahead signals via: (i) action-conditioned future features fused into the current observation embedding and (ii) social reward shaping from predicted human trajectories. Experiments on single- and multi-robot Social-HM3D show state-of-the-art navigation success, with zero-shot transfer to Social-MP3D and real-world deployment on a Unitree Go2, validating generalization and practical applicability. Webpage: https://github.com/hutslib/NavThinker.
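One way to read item (ii), social reward shaping from predicted human trajectories, is as a penalty on planned robot positions that come too close to where pedestrians are predicted to be. The sketch below is an assumption-laden illustration, not the paper's reward: the function name `social_penalty`, the comfort radius, the linear penalty shape, and the discount factor are all hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def social_penalty(robot_path: Sequence[Point],
                   human_paths: Sequence[Sequence[Point]],
                   comfort_radius: float = 1.0,
                   discount: float = 0.9,
                   weight: float = 1.0) -> float:
    """Return a non-positive shaping term from predicted trajectories.

    At each future step t, every predicted human position closer than
    comfort_radius to the robot's planned position adds a penalty that
    grows linearly as the distance shrinks. Later steps are discounted
    because longer-horizon predictions are less reliable.
    All parameters here are illustrative assumptions.
    """
    penalty = 0.0
    for t, (rx, ry) in enumerate(robot_path):
        for path in human_paths:
            if t >= len(path):
                continue  # this human has no prediction at step t
            hx, hy = path[t]
            dist = math.hypot(rx - hx, ry - hy)
            if dist < comfort_radius:
                intrusion = (comfort_radius - dist) / comfort_radius
                penalty += (discount ** t) * intrusion
    return -weight * penalty


if __name__ == "__main__":
    robot = [(0.0, 0.0), (1.0, 0.0)]
    humans = [[(0.0, 2.0), (1.0, 0.5)]]
    print(social_penalty(robot, humans))
```

A term like this would be added to the task reward during policy training, so that actions leading toward predicted pedestrian positions are discouraged before a close encounter actually happens.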

Robotics & Embodied AI | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
