SuperGrasp is a two-stage framework for single-view parallel-jaw grasping. It first retrieves grasp candidates by matching the input point cloud against a precomputed dataset of superquadric primitives, then refines those candidates with an end-to-end network (E-RNet) anchored to the initial grasp closure region. E-RNet expands the grasp-aware region, enabling more accurate evaluation and refinement. Experiments show that SuperGrasp achieves stable grasping and strong generalization in both simulation and real-world settings.
SuperGrasp achieves robust single-view grasping by combining superquadric-based similarity matching with an end-to-end refinement network, outperforming existing methods in stability and generalization.
Robotic grasping from single-view observations remains a critical challenge in manipulation. Existing methods still struggle to generate stable and valid grasp poses when confronted with incomplete geometric information. To address these limitations, we propose SuperGrasp, a novel two-stage framework for single-view grasping with parallel-jaw grippers that decomposes the grasping process into initial grasp pose generation followed by grasp evaluation and refinement. In the first stage, we introduce a Similarity Matching Module that efficiently retrieves grasp candidates by matching the input single-view point cloud with a pre-computed primitive dataset based on superquadric coefficients. In the second stage, we propose E-RNet, an end-to-end network that expands the grasp-aware region and takes the initial grasp closure region as a local anchor region, enabling more accurate and reliable evaluation and refinement of grasp candidates. To enhance generalization, we construct a primitive dataset containing 1.5k primitives for similarity matching and collect a large-scale point cloud dataset with 100k stable grasp labels from 124 objects for network training. Extensive experiments in both simulation and real-world environments demonstrate that our method achieves stable grasping performance and strong generalization across varying scenes and novel objects.
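The first stage's retrieval idea can be sketched in a few lines. The snippet below is an illustrative toy, not the paper's implementation: it assumes each primitive in the dataset is summarized by a superquadric coefficient vector (a1, a2, a3, eps1, eps2) paired with precomputed grasp candidates, and that matching reduces to a nearest-neighbor lookup in coefficient space. The function and variable names, the Euclidean distance metric, and the grasp encoding are all assumptions for illustration.

```python
import numpy as np

def retrieve_grasps(query_coeffs, primitive_coeffs, primitive_grasps):
    """Toy similarity matching: return the grasp candidates stored with
    the primitive whose superquadric coefficients are closest to the
    query's fitted coefficients.

    query_coeffs:     (5,) coefficients fitted to the input point cloud
    primitive_coeffs: (N, 5) coefficients of the precomputed primitives
    primitive_grasps: length-N list of grasp-candidate arrays
    """
    # Plain Euclidean distance in coefficient space (an assumption;
    # a learned or weighted similarity could be substituted here).
    dists = np.linalg.norm(primitive_coeffs - query_coeffs, axis=1)
    nearest = int(np.argmin(dists))
    return nearest, primitive_grasps[nearest]

# Toy dataset: two primitives, each with one dummy 6-DoF grasp pose.
coeffs = np.array([[0.05, 0.05, 0.10, 1.0, 1.0],   # cylinder-like shape
                   [0.04, 0.04, 0.04, 0.3, 0.3]])  # box-like shape
grasps = [np.array([[0.0, 0.0, 0.10, 0.0, 0.0, 0.0]]),
          np.array([[0.0, 0.0, 0.04, 0.0, 0.0, 0.0]])]

# A query close to the cylinder-like primitive retrieves its grasps.
idx, candidates = retrieve_grasps(
    np.array([0.05, 0.05, 0.09, 0.9, 1.0]), coeffs, grasps)
print(idx)  # -> 0
```

In the full system the retrieved candidates would then be scored and refined by the second-stage network rather than executed directly.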