MonoDuo learns bimanual manipulation policies from single-arm robot demonstrations paired with human collaboration: a teleoperated single-arm robot performs one side of a task while a human performs the other, and the two then swap roles to cover both sides. It synthesizes bimanual robot demonstrations by augmenting RGB-D observations from a wrist-mounted camera and a fixed camera using hand pose estimation, segmentation, and inpainting. Across five tasks, policies trained with MonoDuo achieve up to 70% success in zero-shot deployment on unseen bimanual robot configurations, and few-shot finetuning with only 25 target robot demonstrations raises success rates by 65-70% over training from scratch.
Unlock bimanual robot skills without bimanual robots: MonoDuo lets you train bimanual policies using only a single-arm robot and a human partner.
Bimanual coordination is essential for many real-world manipulation tasks, yet learning bimanual robot policies is limited by the scarcity of bimanual robots and datasets. Single-arm robots, however, are widely available in research labs. Can we leverage them to train bimanual robot policies? We present MonoDuo, a framework for learning bimanual manipulation policies using single-arm robot demonstrations paired with human collaboration. MonoDuo collects data by teleoperating a single-arm robot to perform one side of a bimanual task while a human performs the other, then swapping roles to cover both sides. RGB-D observations from a wrist-mounted camera and a fixed camera are augmented into synthetic demonstrations for target bimanual robots using state-of-the-art hand pose estimation, image and point cloud segmentation, and inpainting. These synthetic demonstrations, grounded in real robot kinematics, are used to train bimanual policies. We evaluate MonoDuo on five tasks: box lifting, backpack packing, cloth folding, jacket zipping, and plate handover. Compared to approaches relying solely on human bimanual videos, MonoDuo enables zero-shot deployment on unseen bimanual robot configurations, achieving success rates up to 70%. With only 25 target robot demonstrations, few-shot finetuning further boosts success rates by 65-70% over training from scratch, demonstrating MonoDuo's effectiveness in efficiently transferring knowledge from single-arm robot data to bimanual robot policies.