The paper introduces UltraDexGrasp, a framework for learning universal dexterous grasping with bimanual robots by generating a large-scale synthetic dataset of 20 million frames across 1,000 objects using optimization-based grasp synthesis and planning-based demonstration generation. A grasp policy trained on this dataset, using point clouds as input and unidirectional attention for feature aggregation, achieves robust zero-shot sim-to-real transfer. The resulting policy demonstrates an average success rate of 81.2% in real-world universal dexterous grasping on novel objects.
A bimanual grasp policy trained only on a 20M-frame synthetic dataset transfers zero-shot to the real world, grasping novel objects with an average success rate of 81.2%.
Grasping is a fundamental capability for robots to interact with the physical world. Humans, equipped with two hands, autonomously select appropriate grasp strategies based on the shape, size, and weight of objects, enabling robust grasping and subsequent manipulation. In contrast, current robotic grasping remains limited, particularly in multi-strategy settings. Although substantial efforts have targeted parallel-gripper and single-hand grasping, dexterous grasping for bimanual robots remains underexplored, with data being a primary bottleneck. Achieving physically plausible and geometrically conforming grasps that can withstand external wrenches poses significant challenges. To address these issues, we introduce UltraDexGrasp, a framework for universal dexterous grasping with bimanual robots. The proposed data-generation pipeline integrates optimization-based grasp synthesis with planning-based demonstration generation, yielding high-quality and diverse trajectories across multiple grasp strategies. With this framework, we curate UltraDexGrasp-20M, a large-scale, multi-strategy grasp dataset comprising 20 million frames across 1,000 objects. Based on UltraDexGrasp-20M, we further develop a simple yet effective grasp policy that takes point clouds as input, aggregates scene features via unidirectional attention, and predicts control commands. Trained exclusively on synthetic data, the policy achieves robust zero-shot sim-to-real transfer and consistently succeeds on novel objects with varied shapes, sizes, and weights, attaining an average success rate of 81.2% in real-world universal dexterous grasping. To facilitate future research on grasping with bimanual robots, we open-source the data generation pipeline at https://github.com/InternRobotics/UltraDexGrasp.
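The abstract does not spell out the policy architecture, so the following is only a minimal sketch of one plausible reading of "aggregates scene features via unidirectional attention": a small set of query tokens attends to point-cloud feature tokens, while the point tokens are never updated by the queries. The function name `unidirectional_attention`, all dimensions (number of points, feature width, number of queries, command size), and the random weights are assumptions made for illustration and are not taken from the paper or its repository.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def unidirectional_attention(queries, scene_tokens, w_q, w_k, w_v):
    """Queries read from scene tokens; scene tokens are left unchanged.

    Assumed interpretation of "unidirectional": information flows only
    from the point-cloud tokens to the query tokens, never the reverse.
    """
    q = queries @ w_q                          # (Q, d)
    k = scene_tokens @ w_k                     # (N, d)
    v = scene_tokens @ w_v                     # (N, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (Q, N) scaled dot products
    weights = softmax(scores, axis=-1)         # each query's weights sum to 1
    return weights @ v, weights                # aggregated features: (Q, d)

# Hypothetical sizes, chosen only to make the sketch runnable.
num_points, feat_dim, num_queries, command_dim = 256, 32, 4, 24

rng = np.random.default_rng(0)
scene_tokens = rng.normal(size=(num_points, feat_dim))   # stand-in for encoded point-cloud features
queries = rng.normal(size=(num_queries, feat_dim))       # stand-in for learned query tokens
w_q, w_k, w_v = (rng.normal(size=(feat_dim, feat_dim)) / np.sqrt(feat_dim)
                 for _ in range(3))
w_out = rng.normal(size=(num_queries * feat_dim, command_dim)) / np.sqrt(feat_dim)

aggregated, weights = unidirectional_attention(queries, scene_tokens, w_q, w_k, w_v)

# Linear head mapping the aggregated features to one control command vector.
command = aggregated.reshape(-1) @ w_out
```

In this reading, the attention cost grows linearly with the number of points, since only the few query tokens compute attention over the scene rather than every point attending to every other point.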