CAS | Galbot | PKU | SJTU | Mar 10, 2026 | arXiv:2603.09882

Emerging Extrinsic Dexterity in Cluttered Scenes via Dynamics-aware Policy Learning

Yixin Zheng, Jiangran Lyu, Yifan Zhang, Jiayi Chen, Mi Yan, Yuntian Deng, Xuesong Shi, Xiaoguang Zhao, Yizhou Wang, Zhizheng Zhang, He Wang

AI Summary

This paper introduces Dynamics-Aware Policy Learning (DAPL), a framework that learns a representation of contact-induced object dynamics in cluttered environments through explicit world modeling and uses it to condition a reinforcement learning policy, so that extrinsic dexterity emerges without hand-crafted contact heuristics or complex reward shaping. In experiments, DAPL outperforms prehensile manipulation, human teleoperation, and prior representation-based policies by over 25% in success rate on unseen simulated cluttered scenes of varying density. In the real world, it reaches a success rate of around 50% across 10 cluttered scenes, and a grocery deployment further demonstrates sim-to-real transfer.

Key Contribution

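Rather than relying on hand-crafted contact heuristics or complex reward shaping, the dynamics-aware policy learns to exploit environmental contact in cluttered scenes. It outperforms prehensile manipulation, human teleoperation, and prior representation-based policies by over 25% in success rate in simulation and transfers from simulation to the real world.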

Abstract

Extrinsic dexterity leverages environmental contact to overcome the limitations of prehensile manipulation. However, achieving such dexterity in cluttered scenes remains challenging and underexplored, as it requires selectively exploiting contact among multiple interacting objects with inherently coupled dynamics. Existing approaches lack explicit modeling of such complex dynamics and therefore fall short in non-prehensile manipulation in cluttered environments, which in turn limits their practical applicability in real-world environments. In this paper, we introduce a Dynamics-Aware Policy Learning (DAPL) framework that can facilitate policy learning with a learned representation of contact-induced object dynamics in cluttered environments. This representation is learned through explicit world modeling and used to condition reinforcement learning, enabling extrinsic dexterity to emerge without hand-crafted contact heuristics or complex reward shaping. We evaluate our approach in both simulation and the real world. Our method outperforms prehensile manipulation, human teleoperation, and prior representation-based policies by over 25% in success rate on unseen simulated cluttered scenes with varying densities. The real-world success rate reaches around 50% across 10 cluttered scenes, while a practical grocery deployment further demonstrates robust sim-to-real transfer and applicability.

Robotics & Embodied AI | World Models & Planning

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
