This paper introduces UniDexTok, a state tokenizer that gives diverse dexterous hands a unified representation. It builds on the Unified Dexterous Hand Model (UDHM), which maps human and robot hand states into a shared 22-degree-of-freedom semantic interface. UniDexTok learns embodiment-conditioned discrete tokens from real joint states, with no retargeting or simulation data. Compared with the UniHM baseline, it reduces mean per-joint angle error (MPJAE) from 15.63 degrees to 0.16 degrees and mean per-joint position error (MPJPE) from 18.51 mm to 0.18 mm. Data from other embodiments further improves reconstruction on the target embodiment, and the tokenizer reconstructs well in zero-shot and few-shot settings when new hands are introduced.
UniDexTok cuts hand-state reconstruction error by over 98% relative to the UniHM baseline, bringing joint position error below one millimeter.
Dexterous hands are essential for fine-grained manipulation, but their hardware designs vary substantially across embodiments. Differences in kinematics, joint definitions, and degrees of freedom make a shared state representation much harder to define than it is for parallel grippers. As a result, dexterous-hand data remains fragmented and difficult to use for joint training. In this work, we propose the Unified Dexterous Hand Model (UDHM), which maps human and robot hand states into a shared 22-DoF semantic interface. Based on UDHM, we introduce UniDexTok, a retargeting-free state tokenizer that learns embodiment-conditioned discrete tokens from standardized real joint states. UniDexTok provides a unified representation for heterogeneous dexterous hands without relying on retargeting or simulation data. Compared with the recent baseline UniHM, UniDexTok reduces MPJAE from 15.63 degrees to 0.16 degrees and MPJPE from 18.51 mm to 0.18 mm, corresponding to error reductions of 98.98% and 99.03%, respectively. In position terms, reconstruction error drops from the centimeter scale to the sub-millimeter scale. Experiments further show that data from other embodiments improves reconstruction accuracy on the target embodiment, demonstrating the benefit of cross-embodiment tokenization. UniDexTok also shows strong zero-shot and few-shot reconstruction ability when new dexterous hands are introduced.
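The reported reduction percentages follow directly from the baseline and UniDexTok error values, and the short Python sketch below reproduces that arithmetic. It also includes the conventional definitions of the two metrics as an assumption: the abstract does not spell out how MPJAE and MPJPE are computed, so `mpjae` and `mpjpe` here are illustrative helpers (mean absolute angle difference per joint, and mean Euclidean distance per joint) rather than the authors' evaluation code.

```python
import math


def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


def mpjae(pred_angles, true_angles):
    """Mean per-joint angle error (assumed definition): the mean absolute
    difference between predicted and true joint angles, in the input units."""
    total = sum(abs(p - t) for p, t in zip(pred_angles, true_angles))
    return total / len(true_angles)


def mpjpe(pred_points, true_points):
    """Mean per-joint position error (assumed definition): the mean Euclidean
    distance between predicted and true joint positions, in the input units."""
    total = sum(math.dist(p, t) for p, t in zip(pred_points, true_points))
    return total / len(true_points)


# Values quoted in the abstract: UniHM baseline versus UniDexTok.
print(round(relative_reduction(15.63, 0.16), 2))  # MPJAE reduction: 98.98
print(round(relative_reduction(18.51, 0.18), 2))  # MPJPE reduction: 99.03
```

Both printed values match the 98.98% and 99.03% stated in the abstract, which confirms that the percentages are relative reductions measured against the UniHM figures.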