The paper introduces FAME, a force-adaptive reinforcement learning framework that trains a humanoid robot to maintain balance under external hand forces by conditioning a standing policy on a learned latent context encoding the upper-body joint configuration and bimanual interaction forces. During training, FAME applies spherically sampled 3D forces to simulate disturbances and uses an upper-body pose curriculum to expose the policy to manipulation-induced perturbations. Deployed on a full-scale Unitree H12 humanoid without wrist force/torque sensors, the learned policy estimates interaction forces from the robot dynamics and demonstrates improved standing success in simulation and robustness in real-world load-interaction scenarios.
Humanoid robots can now maintain balance under complex external forces without force/torque sensors, thanks to a force-adaptive RL policy that learns to anticipate and compensate for disturbances.
Maintaining balance under external hand forces is critical for humanoid bimanual manipulation, where interaction forces propagate through the kinematic chain and constrain the feasible manipulation envelope. We propose \textbf{FAME}, a force-adaptive reinforcement learning framework that conditions a standing policy on a learned latent context encoding the upper-body joint configuration and bimanual interaction forces. During training, we apply diverse, spherically sampled 3D forces to each hand to inject disturbances in simulation, together with an upper-body pose curriculum, exposing the policy to manipulation-induced perturbations across continuously varying arm configurations. At deployment, interaction forces are estimated from the robot dynamics and fed to the same encoder, enabling online adaptation without wrist force/torque sensors. In simulation across five fixed arm configurations with randomized hand forces and commanded base heights, FAME improves mean standing success to 73.84%, compared to 51.40% for the curriculum-only baseline and 29.44% for the base policy. We further deploy the learned policy on a full-scale Unitree H12 humanoid and evaluate robustness in representative load-interaction scenarios, including an asymmetric single-arm load and a symmetric bimanual load. Code and videos are available at https://fame10.github.io/Fame/
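The spherical force sampling described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the force-magnitude range, and the use of a normalized Gaussian for uniform directions are all assumptions.

```python
# Hypothetical sketch of sampling a spherically distributed 3D disturbance
# force for one hand during training. The magnitude cap f_max is an assumed
# value, not taken from the paper.
import numpy as np

def sample_hand_force(rng, f_max=50.0):
    """Sample a 3D force with a uniformly random direction on the unit
    sphere and a magnitude drawn uniformly from [0, f_max] newtons."""
    direction = rng.normal(size=3)
    direction /= np.linalg.norm(direction)  # normalized Gaussian -> uniform direction
    magnitude = rng.uniform(0.0, f_max)
    return magnitude * direction

rng = np.random.default_rng(0)
# Forces on the two hands are sampled independently each episode.
left_force = sample_hand_force(rng)
right_force = sample_hand_force(rng)
```

In a training loop, the sampled forces would be applied to the hand links in simulation and also passed (or their estimates at deployment) to the latent context encoder.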