This paper introduces KPGrasp, a flow-matching framework that generates dexterous grasps by learning from large-scale data instead of relying on contact losses or contact-based test-time refinement. By combining an all-Euclidean 3D hand-keypoint parameterization with a scalable Transformer flow model, KPGrasp achieves state-of-the-art performance on two simulation benchmarks, including a 76.3% grasp success rate on the Dexonomy benchmark, an improvement of 47.4% over the strongest directly comparable baseline. The model is also evaluated in real-world experiments on 20 diverse objects, and batched inference takes only 0.032 s per grasp.
KPGrasp reaches a 76.3% grasp success rate on the Dexonomy benchmark, improving over the strongest directly comparable baseline by 47.4%, while removing the need for contact losses and contact-based test-time refinement.
Generating high-quality dexterous grasps remains challenging for learning-based methods, which often depend on carefully tuned contact losses or costly contact-based test-time refinement. We present KPGrasp, a flow-matching framework that learns dexterous grasp priors from large-scale data rather than relying on contact losses or contact-based test-time refinement. KPGrasp couples an all-Euclidean 3D hand-keypoint parameterization with a simple yet scalable Transformer flow model. The parameterization avoids the drawbacks of the conventional mixed SE(3) pose and joint-angle output space, expresses grasps in the same frame as the object point cloud, and thus enables native spatial reasoning; the Transformer flow model is trained with only the standard flow-matching loss and scales effectively with data, model capacity, and batch size. Experiments demonstrate state-of-the-art performance on two simulation benchmarks. On the Dexonomy benchmark, KPGrasp reaches a 76.3% grasp success rate, improving over the strongest directly comparable baseline by 47.4% while reducing penetration depth to 2.4 mm. The same model also achieves the best average performance on the DexGrasp Anything benchmark without fine-tuning. For batched inference, KPGrasp requires only 0.032 s per grasp. Finally, real-world experiments on 20 diverse objects demonstrate that the pipeline can be deployed in a real-world setup.
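
To make the phrase "standard flow-matching loss" concrete, the following is a minimal sketch of one common formulation applied to 3D hand keypoints. It is not taken from the paper: the linear noise-to-data path, the Gaussian source distribution, the Euler sampler, the keypoint count of 21, and the names `flow_matching_loss`, `euler_sample`, and `velocity_fn` are all assumptions made for illustration. In a real system `velocity_fn` would be the Transformer and would also take the object point cloud as conditioning; here it is any callable that maps noisy keypoints and a time value to a predicted velocity.

```python
import numpy as np


def flow_matching_loss(velocity_fn, x1, rng):
    """Flow-matching loss on a batch of keypoint sets (illustrative sketch).

    x1: array of shape (B, K, 3), target hand keypoints expressed in the
        same frame as the object point cloud (K is a hypothetical count).
    velocity_fn: callable (x_t, t) -> predicted velocity with shape of x_t.
    """
    x0 = rng.standard_normal(x1.shape)           # noise sample (assumed Gaussian)
    t = rng.uniform(size=(x1.shape[0], 1, 1))    # one time in [0, 1) per sample
    x_t = (1.0 - t) * x0 + t * x1                # point on the assumed linear path
    target = x1 - x0                             # velocity of that linear path
    pred = velocity_fn(x_t, t)
    return float(np.mean((pred - target) ** 2))  # plain mean squared error


def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate the learned velocity field from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = np.full((x.shape[0], 1, 1), i * dt)
        x = x + dt * velocity_fn(x, t)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grasp = rng.standard_normal((4, 21, 3))      # stand-in for dataset keypoints

    # Untrained stand-in model that always predicts zero velocity.
    def zero_model(x, t):
        return np.zeros_like(x)

    print("loss with zero model:", flow_matching_loss(zero_model, grasp, rng))
```

Because every output is a 3D point, the loss is an ordinary squared error in Euclidean space. The abstract gives this as the motivation for the keypoint parameterization over a mixed SE(3) pose and joint-angle output.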