HARBOR is an agentic framework that automates the reinforcement learning (RL) workflow for robotic tasks. It targets the engineering effort (building tasks, shaping rewards, tuning hyperparameters) that limits broader adoption of RL in sim-to-real settings. Framing the automation as a harness-engineering problem, HARBOR decomposes high-level objectives into bounded stages executed by specialized agents, covering environment setup, reward design, and policy training in simulation. Across 6 benchmarks and 16 tasks, it automates the simulation workflow end-to-end and reduces engineering effort at practical token and wall-clock cost, and the resulting policies can be transferred to real robots.
Automating the robot RL workflow with HARBOR reduces engineering effort at practical token and wall-clock cost, and the resulting policies can be transferred to real robots.
Reinforcement learning (RL) has become a powerful paradigm for robot learning, particularly in sim-to-real settings, but its broader adoption remains limited by the engineering pipeline surrounding the algorithms. Building tasks, shaping rewards, and tuning hyperparameters require substantial expert effort, making RL workflows costly and difficult to scale. We introduce HARBOR, an agentic framework that frames robot RL automation as a harness-engineering problem: given a simulator codebase and a task specification, it automates the workflow from environment setup to policy training in simulation. HARBOR decomposes such high-level objectives into bounded stages executed by specialized agents through standardized commands, persistent artifacts, executable gates, and reusable knowledge, and scales iteration via decentralized parallel trials and experience learning across runs. We evaluate HARBOR across 6 benchmarks and 16 tasks in total, spanning manipulation, locomotion, and bimanual dexterous control. We demonstrate that HARBOR automates the simulation RL workflow end-to-end, designs rewards, tunes algorithms to match or improve over default configurations, and reduces engineering effort at practical token and wall-clock cost; the resulting policies can also be transferred to real robots.
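To make the idea of bounded stages, executable gates, and persistent artifacts more concrete, the following is a minimal Python sketch of a staged harness. It is an illustration under stated assumptions, not HARBOR's implementation: the names `Stage`, `run_pipeline`, and the three toy stage functions are hypothetical, the "agents" are plain functions standing in for LLM-driven agents, and the numeric values are placeholders rather than reported results. Parallel trials and experience learning across runs are not modeled.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, List

Artifact = Dict[str, object]


@dataclass
class Stage:
    """One bounded stage: a step that produces an artifact, plus a gate on it."""
    name: str
    run: Callable[[Artifact], Artifact]   # hypothetical stand-in for an agent
    gate: Callable[[Artifact], bool]      # executable check on the stage output


def run_pipeline(stages: List[Stage], workdir: Path, max_retries: int = 2) -> Artifact:
    """Run stages in order; a stage only advances once its gate passes."""
    state: Artifact = {}
    for stage in stages:
        for _attempt in range(max_retries):
            output = stage.run(dict(state))  # pass a copy of the accumulated state
            if stage.gate(output):
                break
        else:
            raise RuntimeError(
                f"stage {stage.name!r} failed its gate after {max_retries} attempts"
            )
        # Persist the accepted artifact so later stages (or runs) can read it.
        (workdir / f"{stage.name}.json").write_text(json.dumps(output, indent=2))
        state.update(output)
    return state


# Toy stages with placeholder values; real stages would call a simulator
# and an RL training loop.
def setup_env(state: Artifact) -> Artifact:
    return {"env_id": "toy-reach", "obs_dim": 4}


def design_reward(state: Artifact) -> Artifact:
    return {"reward_terms": ["distance", "action_penalty"]}


def train_policy(state: Artifact) -> Artifact:
    return {"success_rate": 0.9}  # placeholder, not a measured result


def demo():
    stages = [
        Stage("env_setup", setup_env, lambda out: out.get("obs_dim", 0) > 0),
        Stage("reward_design", design_reward,
              lambda out: len(out.get("reward_terms", [])) > 0),
        Stage("policy_training", train_policy,
              lambda out: out.get("success_rate", 0.0) >= 0.5),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        final = run_pipeline(stages, workdir)
        written = sorted(p.name for p in workdir.glob("*.json"))
    return final, written


if __name__ == "__main__":
    final_state, artifacts = demo()
    print(final_state)
    print(artifacts)
```

In this sketch the gate is what keeps each stage bounded: a stage that cannot produce an output passing its check stops the pipeline instead of handing a broken environment or reward to the next stage.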