The paper introduces ACDC, a novel goal-conditioned reinforcement learning paradigm for robotic manipulation that combines Adaptive Curriculum (AC) Planning and Dynamic Contrastive (DC) Control. AC dynamically adjusts the learning curriculum by balancing diversity-driven exploration and quality-driven exploitation based on the agent's performance, while DC implements this curriculum using norm-constrained contrastive learning for magnitude-guided experience selection. Experiments on robotic manipulation tasks demonstrate that ACDC achieves superior sample efficiency and task success rates compared to state-of-the-art methods.
ACDC learns robotic manipulation tasks faster and more reliably by adaptively balancing diversity-driven exploration and quality-driven exploitation, then selecting experience with a norm-constrained contrastive objective.
Goal-conditioned reinforcement learning has shown considerable potential in robotic manipulation; however, existing approaches remain limited by their reliance on prioritizing collected experience, resulting in suboptimal performance across diverse tasks. Inspired by human learning behaviors, we propose a more comprehensive learning paradigm, ACDC, which integrates multidimensional Adaptive Curriculum (AC) Planning with Dynamic Contrastive (DC) Control to guide the agent along a well-designed learning trajectory. More specifically, at the planning level, the AC component schedules the learning curriculum by dynamically balancing diversity-driven exploration and quality-driven exploitation based on the agent's success rate and training progress. At the control level, the DC component implements the curriculum plan through norm-constrained contrastive learning, enabling magnitude-guided experience selection aligned with the current curriculum focus. Extensive experiments on challenging robotic manipulation tasks demonstrate that ACDC consistently outperforms the state-of-the-art baselines in both sample efficiency and final task success rate.
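To make the planning idea concrete, the following is a minimal sketch of how a curriculum weight could shift from exploration toward exploitation as success rate and training progress grow. It is not the authors' implementation: the abstract gives no formulas, so the logistic schedule, the `sharpness` parameter, the linear blend of scores, and all function names here are assumptions chosen for illustration. The per-experience `diversity` and `quality` values are placeholders; in ACDC the selection signal would come from the learned norm-constrained contrastive representation, which this sketch does not attempt to reproduce.

```python
import math
import random


def exploration_weight(success_rate, progress, sharpness=4.0):
    """Hypothetical schedule: weight on diversity-driven exploration, in (0, 1).

    Assumption: the weight falls as the agent matures, where maturity is the
    mean of success rate and training progress (both clamped to [0, 1]).
    """
    success_rate = min(max(success_rate, 0.0), 1.0)
    progress = min(max(progress, 0.0), 1.0)
    maturity = 0.5 * (success_rate + progress)
    # Logistic curve centred at maturity = 0.5, decreasing in maturity.
    return 1.0 / (1.0 + math.exp(sharpness * (2.0 * maturity - 1.0)))


def curriculum_scores(diversity, quality, weight):
    """Blend per-experience diversity and quality scores with the given weight."""
    return [weight * d + (1.0 - weight) * q for d, q in zip(diversity, quality)]


def sample_batch(scores, batch_size, rng, eps=1e-6):
    """Sample experience indices with probability proportional to score.

    eps keeps every weight positive so random.choices never sees an all-zero
    weight vector.
    """
    weights = [s + eps for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=batch_size)


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder scores for five stored experiences (illustrative values only).
    diversity = [0.9, 0.2, 0.6, 0.1, 0.8]
    quality = [0.1, 0.9, 0.5, 0.7, 0.3]

    for success_rate, progress in [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]:
        w = exploration_weight(success_rate, progress)
        scores = curriculum_scores(diversity, quality, w)
        batch = sample_batch(scores, batch_size=4, rng=rng)
        print(f"success={success_rate:.1f} progress={progress:.1f} "
              f"weight={w:.3f} batch={batch}")
```

Under these assumptions, early training with few successes puts most of the weight on diverse experiences, and the weight passes through 0.5 when maturity reaches 0.5 before tilting toward high-quality experiences. A different monotone schedule would serve the same illustrative purpose.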