This paper introduces CMoE, a reinforcement learning framework that uses contrastive learning to improve expert specialization in a Mixture of Experts (MoE) model for humanoid robot motion control. CMoE encourages experts to specialize in distinct terrain types by maximizing the consistency of expert activations within the same terrain and minimizing their similarity across different terrains. Experiments on a Unitree G1 robot show that CMoE enables the robot to traverse challenging terrains with continuous steps and gaps while maintaining a robust and natural gait across diverse mixed terrains, outperforming existing methods.
A Unitree G1 humanoid robot can traverse steps up to 20 cm high and gaps up to 80 cm wide thanks to a contrastive learning approach that pushes MoE experts to specialize on different terrain types.
For effective deployment in real-world environments, humanoid robots must autonomously navigate a diverse range of complex terrains with abrupt transitions. While the vanilla Mixture of Experts (MoE) framework is theoretically capable of modeling diverse terrain features, in practice its gating network produces nearly uniform expert activations across different terrains, which weakens expert specialization and limits the model's expressive power. To address this limitation, we introduce CMoE, a novel single-stage reinforcement learning framework that integrates contrastive learning to refine expert activation distributions. By imposing contrastive constraints, CMoE maximizes the consistency of expert activations within the same terrain while minimizing their similarity across different terrains, thereby encouraging experts to specialize in distinct terrain types. We validated our approach on the Unitree G1 humanoid robot through a series of challenging experiments. Results demonstrate that CMoE enables the robot to traverse continuous steps up to 20 cm high and gaps up to 80 cm wide, while achieving a robust and natural gait across diverse mixed terrains, surpassing the limits of existing methods. To support further research and foster community development, we release our code publicly.
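The contrastive constraint on expert activations can be illustrated with a small sketch. The abstract does not give the exact loss, so the code below substitutes a generic supervised-contrastive (InfoNCE-style) objective over gating distributions; it is not the authors' implementation. The function names `cosine` and `gate_contrastive_loss`, the use of cosine similarity, and the `temperature` value are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two expert-activation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def gate_contrastive_loss(gate_probs, terrain_ids, temperature=0.1):
    """Hypothetical contrastive loss on gating outputs (illustrative only).

    gate_probs:  list of per-sample expert activation distributions.
    terrain_ids: list of integer terrain labels, one per sample.
    Samples from the same terrain are positives; all others are negatives.
    """
    n = len(gate_probs)
    per_anchor = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if terrain_ids[j] == terrain_ids[i]]
        if not positives:
            continue  # an anchor with no same-terrain partner contributes nothing
        logits = {j: cosine(gate_probs[i], gate_probs[j]) / temperature for j in others}
        # Numerically stable log-sum-exp over every other sample.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in logits.values()))
        # Average negative log-probability assigned to the same-terrain samples.
        per_anchor.append(-sum(logits[j] - log_denom for j in positives) / len(positives))
    return sum(per_anchor) / len(per_anchor) if per_anchor else 0.0

# Two terrains, two experts: uniform gating versus terrain-specific gating.
terrains = [0, 0, 1, 1]
uniform = [[0.5, 0.5]] * 4
specialized = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]
print(gate_contrastive_loss(uniform, terrains))
print(gate_contrastive_loss(specialized, terrains))
```

Under these assumptions, the uniform gating pattern yields a higher loss than the specialized one, so minimizing such a term alongside the reinforcement learning objective would push the gating network away from uniform activations and toward terrain-specific ones.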