Mar 30, 2026 | arXiv:2603.28760

SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild

Patrick Rim, Kevin Harris, Braden Copple, Shangchen Han, Xu Xie, Ivan Shugurov, Sizhe An, He Wen, Alex Wong, Tomas Hodan, Kun He

AI Summary

The authors introduce SHOW3D, a novel marker-less multi-camera system for capturing hand-object interactions in diverse, unconstrained real-world environments. This system uses a back-mounted multi-camera rig synchronized with a VR headset to enable ego-exo tracking for precise 3D annotation of hands and objects. The resulting large-scale dataset, SHOW3D, bridges the gap between environmental realism and annotation accuracy, demonstrating improved generalization in downstream tasks compared to models trained on studio-captured data.

Key Contribution

Training data no longer needs to choose between realism and accuracy: SHOW3D delivers both for hand-object interaction.

Abstract

Accurate 3D understanding of human hands and objects during manipulation remains a significant challenge for egocentric computer vision. Existing hand-object interaction datasets are predominantly captured in controlled studio settings, which limits both environmental diversity and the ability of models trained on such data to generalize to real-world scenarios. To address this challenge, we introduce a novel marker-less multi-camera system that allows for nearly unconstrained mobility in genuinely in-the-wild conditions, while still having the ability to generate precise 3D annotations of hands and objects. The capture system consists of a lightweight, back-mounted, multi-camera rig that is synchronized and calibrated with a user-worn VR headset. For 3D ground-truth annotation of hands and objects, we develop an ego-exo tracking pipeline and rigorously evaluate its quality. Finally, we present SHOW3D, the first large-scale dataset with 3D annotations that show hands interacting with objects in diverse real-world environments, including outdoor settings. Our approach significantly reduces the fundamental trade-off between environmental realism and accuracy of 3D annotations, which we validate with experiments on several downstream tasks.

Project page: show3d-dataset.github.io
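The abstract states that the back-mounted rig is calibrated with the headset, which is what lets a 3D annotation estimated from the rig (exo) cameras be expressed in the headset (ego) frame. The sketch below illustrates only that general idea with rigid-body transforms. It is not the authors' pipeline: the function names, the poses, and the sample point are all hypothetical values chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]  # 4x4 homogeneous rigid transform, row-major


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Return the matrix product a @ b of two 4x4 transforms."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(t: Matrix) -> Matrix:
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(r_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [r_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def transform_point(t: Matrix, p: List[float]) -> List[float]:
    """Apply a rigid transform to a 3D point."""
    return [sum(t[i][k] * p[k] for k in range(3)) + t[i][3] for i in range(3)]


# Hypothetical calibration result: poses of the headset and of one rig camera
# in a shared world frame (metres). These numbers are made up.
world_from_headset: Matrix = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.6],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
world_from_rig: Matrix = [  # 90-degree rotation about the y axis plus an offset
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.4],
    [-1.0, 0.0, 0.0, -0.3],
    [0.0, 0.0, 0.0, 1.0],
]

# Chain the calibrations: headset <- world <- rig.
headset_from_rig = compose(invert_rigid(world_from_headset), world_from_rig)

# A hypothetical hand keypoint expressed in the rig camera frame.
keypoint_in_rig = [0.1, 0.2, 0.5]
keypoint_in_headset = transform_point(headset_from_rig, keypoint_in_rig)
print(keypoint_in_headset)
```

Because both devices are expressed relative to one shared frame, a single composed transform moves any rig-frame point into the headset frame. A real system would also need time synchronization so that the two poses refer to the same instant.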

Computer Vision | Data Curation & Synthetic Data | Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 40

Year: 2026

Venue: N/A
