This paper introduces Robust Policy Gating (RPG), a hybrid expert policy framework, to address the challenge of unstable transitions between skills in humanoid robot fighting. RPG uses motion transition randomization and temporal randomization during training to create a unified policy capable of generating agile fighting actions with smooth and stable skill transitions. The framework integrates locomotion with fighting skills, enabling long-duration combat and seamless policy transitions, validated in simulation and real-world experiments on a Unitree G1 robot.
Humanoid robots can now seamlessly transition between fighting skills thanks to a novel policy gating approach that ensures stability and smoothness.
Humanoid robots have demonstrated impressive motor skills in a wide range of tasks, yet whole-body control for humanlike, long-duration, dynamic fighting remains particularly challenging due to the stringent requirements on agility and stability. While imitation learning enables robots to execute human-like fighting skills, existing approaches often rely on switching among multiple single-skill policies or on a general policy that imitates input reference motions. Both strategies suffer from instability when transitioning between skills: the mismatch between the terminal state of one skill or reference motion and the initial state of the next introduces out-of-domain disturbances, resulting in non-smooth or unstable behaviors. In this work, we propose RPG, a hybrid expert policy framework for smooth and stable multi-skill transitions on humanoid robots. Our approach incorporates motion transition randomization and temporal randomization to train a unified policy that generates agile fighting actions while remaining stable and smooth during skill transitions. Furthermore, we design a control pipeline that integrates walking and running locomotion with fighting skills, allowing humanlike combat of arbitrary duration in which the active skill can be seamlessly interrupted or switched to another action policy at any time. Extensive experiments in simulation demonstrate the effectiveness of the proposed framework, and real-world deployment on the Unitree G1 humanoid robot further validates its robustness and applicability.