This paper introduces an end-to-end reinforcement learning controller for point-to-point navigation of spherical robots, addressing the mismatch between the planner and the tracker in traditional hierarchical methods. The controller takes proprioceptive information as input and directly outputs motor commands, incorporating a long history encoder, tailored reward functions, and curriculum learning to handle the unique dynamics of spherical robots. Experiments demonstrate high efficiency, stability, and adaptability in both simulation (88.87% success rate) and real-world scenarios; sim-to-real transfer is achieved through an MC-CMA-ES-based system identification method.
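The summary mentions that the policy consumes a long history of proprioceptive observations. As a rough illustration only (the paper's actual encoder is learned, and all names and dimensions below are hypothetical), a minimal sketch of maintaining and flattening such an observation history might look like:

```python
from collections import deque

class HistoryEncoder:
    """Toy stand-in for a long history encoder: keeps the last
    `horizon` proprioceptive observations and flattens them into a
    single policy input vector. A learned encoder would compress this
    further; sizes and names here are illustrative, not the paper's."""

    def __init__(self, obs_dim, horizon):
        self.obs_dim = obs_dim
        self.horizon = horizon
        # Zero-padded buffer so the encoding has a fixed size from step one.
        self.buf = deque([[0.0] * obs_dim for _ in range(horizon)],
                         maxlen=horizon)

    def push(self, obs):
        assert len(obs) == self.obs_dim
        self.buf.append(list(obs))

    def encode(self):
        # Flatten oldest-to-newest into one vector for the policy network.
        return [x for obs in self.buf for x in obs]

# Example: 3-D proprioception (e.g. roll, pitch, yaw rate), 4-step history.
enc = HistoryEncoder(obs_dim=3, horizon=4)
enc.push([0.1, 0.0, -0.2])
vec = enc.encode()  # length 3 * 4 = 12, newest observation last
```

A fixed-size flattened history is the simplest interface between a recurrent-free policy and time-dependent dynamics; the paper's encoder presumably replaces the flattening with a learned compression.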
Forget hierarchical planners: RL can directly control spherical robots for efficient point-to-point navigation, and the learned policies transfer from simulation to the real world.
Point-to-point navigation is an important ability for spherical robots. Traditional methods usually use a planner and a tracker for short-range target control. However, this hierarchical approach suffers from a mismatch between the two components. In this work, we propose an end-to-end controller based on reinforcement learning, designed for waypoint tracking after a path has been planned. Taking proprioceptive information such as the robot's position and orientation as input, our controller directly outputs motor commands to control the spherical robot. To adapt to the unique characteristics of a spherical robot, we design various reward functions, a long history encoder, and curriculum learning. We demonstrate that our policy can execute point-to-point tasks with high efficiency, stability, and adaptability to uncertain environments, achieving a success rate of 88.87% in simulation. To transfer the policy trained in simulation to the real world, we developed an MC-CMA-ES method for system identification to accurately model the simulator's parameters. This process significantly narrows the gap between simulation and reality, enabling our policy to achieve high stability and efficiency in real-world scenarios.
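The sim-to-real step fits simulator parameters so that simulated responses match logged real-robot data. The paper's method is MC-CMA-ES; as a hedged sketch of the same identification loop, the following uses a much simpler cross-entropy-style evolution strategy instead of CMA-ES, with a toy one-dimensional simulator whose gain and damping stand in for the real simulator parameters (all functions and values here are illustrative, not from the paper):

```python
import random

def simulate(params, command):
    # Hypothetical stand-in for the physics simulator: a toy 1-D
    # response whose gain and damping are the parameters to identify.
    gain, damping = params
    return gain * command - damping * command ** 2

def sysid_loss(params, real_log):
    # Mean squared error between simulated and "real" responses over
    # a log of (command, observed response) pairs.
    return sum((simulate(p := params, u) - y) ** 2
               for u, y in real_log) / len(real_log)

def identify(real_log, init=(1.0, 0.0), sigma=0.5, pop=32, iters=60, seed=0):
    # Simplified evolution-strategy loop (NOT the paper's MC-CMA-ES):
    # sample candidates around the mean, keep the elite quarter,
    # re-center the mean on the elites, and shrink the step size.
    rng = random.Random(seed)
    mean = list(init)
    for _ in range(iters):
        cand = [[m + sigma * rng.gauss(0, 1) for m in mean]
                for _ in range(pop)]
        cand.sort(key=lambda p: sysid_loss(p, real_log))
        elite = cand[: pop // 4]
        mean = [sum(p[i] for p in elite) / len(elite)
                for i in range(len(mean))]
        sigma *= 0.95
    return mean

# Synthetic "real-robot" log generated with true parameters
# gain = 2.0, damping = 0.3 (noise-free for clarity).
true_params = (2.0, 0.3)
log = [(u / 10.0, simulate(true_params, u / 10.0)) for u in range(1, 21)]
est = identify(log)  # should land near (2.0, 0.3)
```

Full CMA-ES additionally adapts a covariance matrix over the parameters, which matters when parameters are correlated or differently scaled; the loop structure (sample, rank by sim-vs-real error, update the search distribution) is the same.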