The paper introduces One-Policy-Fits-All (OPFA), a framework for cross-embodiment robot manipulation that learns a single policy applicable to diverse robotic end-effectors. OPFA uses a Geometry-Aware Latent Representation (GaLR), built with 3D convolutional networks and transformers, to create a shared latent action space, and a unified latent retargeting decoder to extract embodiment-specific actions from it. In experiments across 11 end-effectors, cross-embodiment co-training raises success rates by more than 50% over single-source training, and as few as eight demonstrations from a new embodiment reach performance comparable to a model trained on 72.
Forget training separate policies for every robot hand: this method learns one policy to control them all, cutting data needs and raising success rates by more than 50% over single-source training in cross-embodiment manipulation.
Cross-embodiment manipulation is crucial for enhancing the scalability of robot manipulation and reducing the high cost of data collection. However, the significant differences between embodiments, such as variations in action spaces and structural disparities, pose challenges for joint training across multiple sources of data. To address this, we propose One-Policy-Fits-All (OPFA), a framework that enables learning a single, versatile policy across multiple embodiments. We first learn a Geometry-Aware Latent Representation (GaLR), which leverages 3D convolutional networks and transformers to build a shared latent action space across different embodiments. We then design a unified latent retargeting decoder that extracts embodiment-specific actions from the latent representations, without any embodiment-specific decoder tuning. OPFA enables end-to-end co-training on data from diverse embodiments, including various grippers and dexterous hands with arbitrary degrees of freedom, significantly improving data efficiency and reducing the cost of skill transfer. We conduct extensive experiments across 11 different end-effectors. The results demonstrate that OPFA significantly improves policy performance in diverse settings by leveraging heterogeneous embodiment data. For instance, cross-embodiment co-training can improve success rates by more than 50% compared to single-source training. Moreover, with only a few demonstrations from a new embodiment (e.g., eight), OPFA can achieve performance comparable to that of a well-trained model with 72 demonstrations.
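The abstract does not say how a single decoder can serve end-effectors with different degrees of freedom. The toy sketch below shows one way this could work. It is an illustrative assumption, not the authors' architecture. One set of weights scores each pair of shared latent and per-joint descriptor, so the number of output commands follows the embodiment's joint count rather than the decoder. All names (`make_shared_decoder`, `decode_actions`, `EMBODIMENTS`), the dimensions, and the per-joint descriptors are hypothetical.

```python
import math
import random

LATENT_DIM = 4      # hypothetical size of the shared latent action
JOINT_FEAT_DIM = 3  # hypothetical size of a per-joint geometry descriptor


def make_shared_decoder(seed=0):
    """Create one weight vector reused for every joint of every embodiment."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(LATENT_DIM + JOINT_FEAT_DIM)]


def decode_actions(weights, latent, joint_features):
    """Map one shared latent to one command per joint.

    The same weights are applied to each (latent, joint descriptor) pair,
    so the output length equals the number of joints passed in.
    """
    actions = []
    for feat in joint_features:
        x = list(latent) + list(feat)
        score = sum(w * v for w, v in zip(weights, x))
        actions.append(math.tanh(score))  # squash into a normalized [-1, 1] range
    return actions


# Two hypothetical embodiments with different degrees of freedom.
EMBODIMENTS = {
    "parallel_gripper": [[1.0, 0.0, 0.0]],
    "three_finger_hand": [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.0]],
}

weights = make_shared_decoder()
latent = [0.2, -0.1, 0.4, 0.0]  # stand-in for a policy's latent action output
decoded = {
    name: decode_actions(weights, latent, feats)
    for name, feats in EMBODIMENTS.items()
}
```

Because no weights are tied to a particular embodiment, adding a new end-effector to this toy only means supplying its joint descriptors. This loosely mirrors the abstract's claim that no embodiment-specific decoder tuning is required, though the actual GaLR encoder and retargeting decoder are learned networks whose details are not given here.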