Jun 8, 2026 · arXiv:2606.09314

KPGrasp: Scalable Keypoint Flow Matching for Dexterous Grasp Generation

Yuansen Huang, Jiayi Chen, Haoran Liu, Yubin Ke, Bing Han, Jiangran Lyu, Mi Yan, Li Yi, He Wang

AI Summary

This paper introduces KPGrasp, a flow-matching framework that generates dexterous grasps by learning from large-scale data instead of relying on contact losses or contact-based test-time refinement. It combines an all-Euclidean 3D hand-keypoint parameterization with a scalable Transformer flow model and reports state-of-the-art performance on two simulation benchmarks, including a 76.3% grasp success rate on the Dexonomy benchmark, an improvement of 47.4% over the strongest directly comparable baseline. Batched inference takes 0.032 s per grasp, and real-world experiments on 20 diverse objects show that the pipeline can be deployed in a real-world setup.

Key Contribution

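KPGrasp reaches a 76.3% grasp success rate on the Dexonomy benchmark, improving over the strongest directly comparable baseline by 47.4%, while training with only the standard flow-matching loss and no contact losses or contact-based test-time refinement.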

Abstract

Generating high-quality dexterous grasps remains challenging for learning-based methods, which often depend on carefully tuned contact losses or costly contact-based test-time refinement. We present KPGrasp, a flow-matching framework that learns dexterous grasp priors from large-scale data rather than relying on contact losses or contact-based test-time refinement. KPGrasp couples an all-Euclidean 3D hand-keypoint parameterization with a simple yet scalable Transformer flow model. The parameterization avoids the drawbacks of the conventional mixed SE(3) pose and joint-angle output space, expresses grasps in the same frame as the object point cloud, and thus enables native spatial reasoning; the Transformer flow model is trained with only the standard flow-matching loss and scales effectively with data, model capacity, and batch size. Experiments demonstrate state-of-the-art performance on two simulation benchmarks. On the Dexonomy benchmark, it reaches a 76.3% grasp success rate, improving over the strongest directly comparable baseline by 47.4% while reducing penetration depth to 2.4 mm. The same model also achieves the best average performance on the DexGrasp Anything benchmark without fine-tuning. For batched inference, KPGrasp requires only 0.032 s per grasp. Finally, real-world experiments on 20 diverse objects demonstrate that the pipeline can be deployed in a real-world setup.
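To make the "standard flow-matching loss" on Euclidean keypoints concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear (rectified-flow style) interpolation path between Gaussian noise and the target keypoints, a plain Euler sampler, and a hypothetical `velocity_fn(x_t, t)` standing in for the Transformer flow model; object point-cloud conditioning is omitted, and none of these choices are stated in the abstract.

```python
import numpy as np


def flow_matching_loss(velocity_fn, keypoints, rng):
    """Flow-matching loss on hand keypoints (illustrative sketch).

    keypoints: array of shape (B, K, 3), target 3D hand keypoints expressed
        in the same frame as the object point cloud. K is an assumption.
    velocity_fn: hypothetical callable (x_t, t) -> predicted velocity with
        the same shape as x_t; stands in for the learned flow model.
    """
    batch = keypoints.shape[0]
    noise = rng.standard_normal(keypoints.shape)   # x_0 ~ N(0, I)
    t = rng.uniform(size=(batch, 1, 1))            # one time per sample, in [0, 1)
    x_t = (1.0 - t) * noise + t * keypoints        # assumed linear path
    target_velocity = keypoints - noise            # d x_t / d t along that path
    predicted = velocity_fn(x_t, t)
    return float(np.mean((predicted - target_velocity) ** 2))


def sample(velocity_fn, shape, rng, steps=10):
    """Integrate the learned velocity field from noise (t=0) to t=1 with Euler steps."""
    x = rng.standard_normal(shape)
    dt = 1.0 / steps
    for i in range(steps):
        t = np.full((shape[0], 1, 1), i * dt)
        x = x + dt * velocity_fn(x, t)
    return x
```

Because every output is a 3D point, the noise, the interpolation, and the regression target all live in ordinary Euclidean space, so no special handling of rotations or joint limits is needed in the loss. In practice `velocity_fn` would also take the object point cloud as conditioning input; here it is left as a two-argument callable to keep the sketch short.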

Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
