University | Mar 18, 2026 | arXiv:2603.17396

Gesture-Aware Pretraining and Token Fusion for 3D Hand Pose Estimation

AI Summary

This paper introduces a two-stage framework for 3D hand pose estimation from monocular RGB images, leveraging gesture semantics as an inductive bias. The approach involves gesture-aware pretraining using coarse and fine gesture labels from InterHand2.6M to learn an informative embedding space. This is followed by a per-joint token Transformer, guided by gesture embeddings, for regressing MANO hand parameters, resulting in improved single-hand accuracy over the EANet baseline on InterHand2.6M.

Key Contribution

Gesture-aware pretraining improves single-hand 3D pose accuracy over the EANet baseline on InterHand2.6M, indicating that semantic gesture information can act as a useful inductive bias.

Abstract

Estimating 3D hand pose from monocular RGB images is fundamental for applications in AR/VR, human-computer interaction, and sign language understanding. In this work we focus on a scenario where a discrete set of gesture labels is available and show that gesture semantics can serve as a powerful inductive bias for 3D pose estimation. We present a two-stage framework: gesture-aware pretraining that learns an informative embedding space using coarse and fine gesture labels from InterHand2.6M, followed by a per-joint token Transformer guided by gesture embeddings as intermediate representations for final regression of MANO hand parameters. Training is driven by a layered objective over parameters, joints, and structural constraints. Experiments on InterHand2.6M demonstrate that gesture-aware pretraining consistently improves single-hand accuracy over the state-of-the-art EANet baseline, and that the benefit transfers across architectures without any modification.
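The abstract names a layered objective over parameters, joints, and structural constraints but does not give its terms. The following is a minimal sketch of what such an objective could look like, not the authors' formulation. The specific choices are assumptions made for illustration: an L1 term on a flat parameter vector, a mean per-joint Euclidean error, a bone-length consistency term as the structural constraint, the 21-joint parent ordering in `PARENTS`, and the default weights.

```python
import math

# Hypothetical parent index for a 21-joint hand skeleton (wrist = 0,
# four joints per finger). This ordering is an assumption, not taken
# from the paper or from any dataset specification.
PARENTS = [-1, 0, 1, 2, 3, 0, 5, 6, 7, 0, 9, 10, 11, 0, 13, 14, 15, 0, 17, 18, 19]


def param_loss(pred_params, gt_params):
    """Mean absolute error between two flat parameter vectors."""
    return sum(abs(p - g) for p, g in zip(pred_params, gt_params)) / len(gt_params)


def joint_loss(pred_joints, gt_joints):
    """Mean Euclidean distance between corresponding 3D joints."""
    return sum(math.dist(p, g) for p, g in zip(pred_joints, gt_joints)) / len(gt_joints)


def bone_lengths(joints, parents=PARENTS):
    """Length of every bone, i.e. the distance from each joint to its parent."""
    return [math.dist(joints[j], joints[p]) for j, p in enumerate(parents) if p >= 0]


def bone_loss(pred_joints, gt_joints):
    """Structural term: mean absolute difference in bone lengths."""
    pred, gt = bone_lengths(pred_joints), bone_lengths(gt_joints)
    return sum(abs(a - b) for a, b in zip(pred, gt)) / len(gt)


def layered_loss(pred_params, gt_params, pred_joints, gt_joints,
                 w_param=1.0, w_joint=1.0, w_bone=0.1):
    """Weighted sum of the three layers; the weights are placeholder values."""
    return (w_param * param_loss(pred_params, gt_params)
            + w_joint * joint_loss(pred_joints, gt_joints)
            + w_bone * bone_loss(pred_joints, gt_joints))
```

In this sketch a rigid translation of the predicted hand is penalized only by the joint term, since bone lengths are unchanged. That separation is one reason a structural term is often kept distinct from the positional one.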

Computer Vision, Multimodal Models, Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
