The paper introduces Part-Aware 3D Feature Field (PA3FF), a novel dense 3D feature representation trained with contrastive learning on 3D part proposals to encode functional part information for articulated objects. PA3FF addresses limitations of prior 2D-based foundation features by directly predicting a continuous 3D feature field from point clouds, enabling efficient and geometrically-aware reasoning. The proposed Part-Aware Diffusion Policy (PADP), built upon PA3FF, demonstrates improved sample efficiency and generalization in simulated and real-world articulated object manipulation tasks compared to existing 2D and 3D representations.
Forget 2D features – PA3FF learns a 3D representation of articulated objects that understands functional parts, leading to better robot manipulation.
Articulated object manipulation is essential for many real-world robotic tasks, yet generalizing across diverse objects remains a major challenge. A key to generalization lies in understanding functional parts (e.g., door handles and knobs), which indicate where and how to manipulate across diverse object categories and shapes. Previous works have attempted to achieve generalization by introducing foundation features, but these features are mostly 2D-based and do not specifically consider functional parts. Lifting these 2D features into geometry-rich 3D space raises challenges such as long runtimes, multi-view inconsistencies, and low spatial resolution with insufficient geometric information. To address these issues, we propose Part-Aware 3D Feature Field (PA3FF), a novel dense 3D feature with part awareness for generalizable articulated object manipulation. PA3FF is trained on 3D part proposals from a large-scale labeled dataset using a contrastive learning formulation. Given point clouds as input, PA3FF predicts a continuous 3D feature field in a feedforward manner, where the distance between point features reflects the proximity of functional parts: points with similar features are more likely to belong to the same part. Building on this feature, we introduce the Part-Aware Diffusion Policy (PADP), an imitation learning framework aimed at enhancing sample efficiency and generalization for robotic manipulation. We evaluate PADP on several simulated and real-world tasks, demonstrating that PA3FF consistently outperforms a range of 2D and 3D representations, including CLIP, DINOv2, and Grounded-SAM, in manipulation scenarios. Beyond imitation learning, PA3FF enables diverse downstream methods, including correspondence learning and segmentation tasks, making it a versatile foundation for robotic manipulation. Project page: https://pa3ff.github.io
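The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common choice: a supervised contrastive (InfoNCE-style) loss that pulls together features of points sharing a part label and pushes apart features of points from different parts. The function name `part_contrastive_loss`, the `temperature` parameter, and the toy data are assumptions made for illustration; they are not taken from the paper or its code.

```python
import numpy as np


def part_contrastive_loss(features, part_ids, temperature=0.1):
    """Supervised contrastive loss over per-point features (illustrative sketch).

    features: (N, D) array, one feature vector per 3D point.
    part_ids: (N,) integer array, the part proposal each point belongs to.
    Returns a scalar; lower means same-part points have more similar features.
    """
    features = np.asarray(features, dtype=np.float64)
    part_ids = np.asarray(part_ids)
    n = len(part_ids)

    # Positives for each anchor: other points with the same part id.
    not_self = ~np.eye(n, dtype=bool)
    positives = (part_ids[:, None] == part_ids[None, :]) & not_self
    pos_count = positives.sum(axis=1)
    valid = pos_count > 0
    if not valid.any():
        return 0.0  # no anchor has a positive, so there is nothing to contrast

    # Cosine similarity between all pairs, scaled by the temperature.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    z = features / np.clip(norms, 1e-12, None)
    logits = z @ z.T / temperature

    # Log-softmax over all *other* points (self-similarity is masked out).
    logits = np.where(not_self, logits, -np.inf)
    row_max = logits.max(axis=1, keepdims=True)
    log_denominator = row_max + np.log(
        np.exp(logits - row_max).sum(axis=1, keepdims=True)
    )
    log_prob = logits - log_denominator

    # Average log-probability of the positives, per anchor that has any.
    pos_log_prob = np.where(positives, log_prob, 0.0).sum(axis=1)
    mean_pos_log_prob = pos_log_prob[valid] / pos_count[valid]
    return float(-mean_pos_log_prob.mean())


# Toy usage: six points, two parts, features clustered by part.
rng = np.random.default_rng(0)
part_ids = np.array([0, 0, 0, 1, 1, 1])
centers = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
features = centers[part_ids] + 0.01 * rng.normal(size=(6, 3))
loss = part_contrastive_loss(features, part_ids)
```

When the features already cluster by part, as in the toy data, the loss is low. Assigning part labels that cut across the clusters raises it, which is the property that makes feature distance usable as a proxy for part membership.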