School of Informatics | SJTU | Mar 12, 2026 | arXiv:2603.11558

RoboClaw: An Agentic Framework for Scalable Long-Horizon Robotic Tasks

Ruiying Li, Yunlang Zhou, Yuyao Zhu, Kylin Chen, Sukai Wang, K. Hu, Kongtao Hu, Minhui Yu, Min-ji Yu, Bowen Jiang, Zhan Su, Zhanqi Su, Jiayao Ma, Xin He, Yongjian Shen, Yangyang, Guanghui Ren, Maoqing Yao, Wenhao Wang, Yao Mu

AI Summary

RoboClaw is introduced as an agentic robotics framework that unifies data collection, policy learning, and task execution under a single VLM-driven controller for long-horizon tasks. The core innovation is Entangled Action Pairs (EAP), which couple forward manipulation behaviors with inverse recovery actions, creating self-resetting loops for continuous on-policy data acquisition and iterative policy refinement. Real-world experiments demonstrate a 25% improvement in success rate on long-horizon tasks and a 53.7% reduction in human time investment compared to conventional open-loop pipelines.

Key Contribution

Forget brittle multi-policy execution and manual resets: RoboClaw's "Entangled Action Pairs" let robots self-correct and learn continuously, slashing human intervention by over 50% while boosting task success.

Abstract

Vision-Language-Action (VLA) systems have shown strong potential for language-driven robotic manipulation. However, scaling them to long-horizon tasks remains challenging. Existing pipelines typically separate data collection, policy learning, and deployment, resulting in heavy reliance on manual environment resets and brittle multi-policy execution. We present RoboClaw, an agentic robotics framework that unifies data collection, policy learning, and task execution under a single VLM-driven controller. At the policy level, RoboClaw introduces Entangled Action Pairs (EAP), which couple forward manipulation behaviors with inverse recovery actions to form self-resetting loops for autonomous data collection. This mechanism enables continuous on-policy data acquisition and iterative policy refinement with minimal human intervention. During deployment, the same agent performs high-level reasoning and dynamically orchestrates learned policy primitives to accomplish long-horizon tasks. By maintaining consistent contextual semantics across collection and execution, RoboClaw reduces mismatch between the two phases and improves multi-policy robustness. Experiments on real-world manipulation tasks demonstrate improved stability and scalability compared to conventional open-loop pipelines while significantly reducing human effort throughout the robot lifecycle. RoboClaw achieves a 25% improvement in success rate over baseline methods on long-horizon tasks and reduces human time investment by 53.7%.
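The self-resetting loop behind Entangled Action Pairs can be illustrated with a minimal sketch. This is not the authors' implementation: the `EntangledActionPair` class, the toy `place`/`remove` skills, the success probability, and the retry limit are all hypothetical stand-ins chosen only to show how pairing a forward behavior with its inverse lets collection continue without a manual reset in most cycles.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, bool]


@dataclass
class EntangledActionPair:
    """Hypothetical container: a forward skill plus the inverse skill that undoes it."""
    name: str
    forward: Callable[[State], bool]  # returns True if the forward task succeeded
    inverse: Callable[[State], bool]  # returns True if the start state is restored


def make_toy_pair(rng: random.Random, p_success: float = 0.8) -> EntangledActionPair:
    """Build a toy place/remove pair whose skills succeed with probability p_success."""

    def place(state: State) -> bool:
        if rng.random() < p_success:
            state["placed"] = True
        return state["placed"]

    def remove(state: State) -> bool:
        if state["placed"] and rng.random() < p_success:
            state["placed"] = False
        return not state["placed"]

    return EntangledActionPair("place_object", place, remove)


def collect_with_self_reset(
    pair: EntangledActionPair,
    state: State,
    cycles: int,
    max_inverse_tries: int = 3,
) -> Tuple[List[Tuple[str, bool]], int]:
    """Alternate forward and inverse rollouts, logging every attempt as data."""
    initial = dict(state)  # snapshot used only when a human has to step in
    log: List[Tuple[str, bool]] = []
    human_resets = 0
    for _ in range(cycles):
        log.append(("forward", pair.forward(state)))
        for _attempt in range(max_inverse_tries):
            restored = pair.inverse(state)
            log.append(("inverse", restored))
            if restored:
                break
        else:
            # The inverse skill never restored the scene: count a manual reset.
            human_resets += 1
            state.clear()
            state.update(initial)
    return log, human_resets


rng = random.Random(0)
state = {"placed": False}
log, human_resets = collect_with_self_reset(make_toy_pair(rng), state, cycles=5)
print(len(log), "logged rollouts,", human_resets, "manual resets")
```

In this toy setting every forward and inverse attempt, successful or not, lands in the log, which mirrors the idea of continuous on-policy data acquisition; a human reset is counted only when the inverse skill exhausts its retries.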

Multimodal Models | Robotics & Embodied AI | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 28

Year: 2026

Venue: N/A
