This paper addresses the representation gap between grasp synthesis for rigid and soft grippers by learning a mapping from rigid gripper poses to soft Fin-ray gripper poses using Conditional Flow Matching (CFM). The authors collect paired rigid-soft grasp pose data and train a CFM model, conditioned through a U-Net autoencoder on object geometry from depth images, to learn a continuous transformation between the two. Experiments on a 7-DOF robot show that, when executed by the soft gripper, CFM-generated poses achieve higher grasp success rates than the baseline rigid poses for both seen and unseen objects, with the largest gains on cylindrical and spherical shapes.
Soft robots can now learn to grasp objects more effectively by translating rigid-gripper grasps into successful soft-gripper grasps using a conditional generative model.
A representation gap exists between grasp synthesis for rigid and soft grippers. Anygrasp [1] and many other grasp synthesis methods are designed for rigid parallel grippers, and adapting them to soft grippers often fails to capture their unique compliant behaviors, resulting in data-intensive and inaccurate models. To bridge this gap, this paper proposes a novel framework to map grasp poses from a rigid gripper model to a soft Fin-ray gripper. We utilize Conditional Flow Matching (CFM), a generative model, to learn this complex transformation. Our methodology includes a data collection pipeline to generate paired rigid-soft grasp poses. A U-Net autoencoder conditions the CFM model on the object's geometry from a depth image, allowing it to learn a continuous mapping from an initial Anygrasp pose to a stable Fin-ray gripper pose. We validate our approach on a 7-DOF robot, demonstrating that our CFM-generated poses achieve a higher overall success rate for seen and unseen objects (34% and 46% respectively) compared to the baseline rigid poses (6% and 25% respectively) when executed by the soft gripper. The model shows significant improvements, particularly for cylindrical (50% and 100% success for seen and unseen objects) and spherical objects (25% and 31% success for seen and unseen objects), and successfully generalizes to unseen objects. This work presents CFM as a data-efficient and effective method for transferring grasp strategies, offering a scalable methodology for other soft robotic systems.
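The core idea of flow matching between paired poses can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a straight-line probability path between a rigid pose x0 and its paired soft pose x1 (a common choice in flow matching), a hypothetical 6-element pose vector, and a placeholder `velocity_fn` standing in for the learned network. The abstract does not state which path, pose parameterization, or integrator the paper uses, and the depth-image conditioning is represented here only by an opaque `cond` argument.

```python
import numpy as np

def cfm_training_pair(x0, x1, t):
    """Build one flow-matching training sample on a straight-line path.

    x0: source pose (here: the rigid-gripper pose), x1: paired target pose
    (here: the soft-gripper pose), t: scalar time in [0, 1].
    Returns the interpolated point x_t and the target velocity u_t.
    """
    x_t = (1.0 - t) * x0 + t * x1   # point on the straight path at time t
    u_t = x1 - x0                   # constant velocity along that path
    return x_t, u_t

def cfm_loss(velocity_fn, x0, x1, cond, t):
    """Mean squared error between predicted and target velocity."""
    x_t, u_t = cfm_training_pair(x0, x1, t)
    pred = velocity_fn(x_t, t, cond)
    return float(np.mean((pred - u_t) ** 2))

def euler_integrate(velocity_fn, x0, cond, steps=10):
    """Transport x0 from t=0 to t=1 with fixed-step Euler integration."""
    x = np.asarray(x0, dtype=float).copy()
    dt = 1.0 / steps
    for k in range(steps):
        x = x + dt * velocity_fn(x, k * dt, cond)
    return x

# Hypothetical paired poses: [x, y, z, rx, ry, rz]; values are illustrative only.
rigid_pose = np.array([0.10, 0.00, 0.30, 0.0, 0.0, 0.0])
soft_pose = np.array([0.12, 0.01, 0.28, 0.0, 0.1, 0.0])

# Stand-in for a trained network: an oracle that returns the true velocity.
oracle = lambda x, t, cond: soft_pose - rigid_pose

mapped_pose = euler_integrate(oracle, rigid_pose, cond=None, steps=10)
```

In a real system `velocity_fn` would be a neural network trained by minimizing `cfm_loss` over many pose pairs and random values of `t`, with `cond` carrying the depth-image features. Interpolating rotation components linearly is a simplification made here for brevity.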