The paper introduces WildGHand, an optimization-based framework for reconstructing high-fidelity 3D hand avatars from monocular in-the-wild videos that contain severe perturbations. WildGHand disentangles perturbations by representing them as time-varying biases on 3D Gaussian attributes, and it guides optimization with per-frame anisotropic weighted masks. Experiments on a newly curated dataset and two public datasets show state-of-the-art performance, with up to a 15.8% relative gain in PSNR and a 23.1% relative reduction in LPIPS over the base model.
Reconstructing realistic 3D hand avatars from messy, real-world video improves markedly with a new method that explicitly models and suppresses perturbations such as motion blur and hand-object interactions.
Despite recent progress in 3D hand reconstruction from monocular videos, most existing methods rely on data captured in well-controlled environments and therefore degrade in real-world settings with severe perturbations, such as hand-object interactions, extreme poses, illumination changes, and motion blur. To tackle these issues, we introduce WildGHand, an optimization-based framework that enables self-adaptive 3D Gaussian splatting on in-the-wild videos and produces high-fidelity hand avatars. WildGHand incorporates two key components: (i) a dynamic perturbation disentanglement module that explicitly represents perturbations as time-varying biases on 3D Gaussian attributes during optimization, and (ii) a perturbation-aware optimization strategy that generates per-frame anisotropic weighted masks to guide optimization. Together, these components allow the framework to identify and suppress perturbations across both spatial and temporal dimensions. We further curate a dataset of monocular hand videos captured under diverse perturbations to benchmark in-the-wild hand avatar reconstruction. Extensive experiments on this dataset and two public datasets demonstrate that WildGHand achieves state-of-the-art performance and substantially improves over its base model across multiple metrics (e.g., up to a $15.8\%$ relative gain in PSNR and a $23.1\%$ relative reduction in LPIPS). Our implementation and dataset are available at https://github.com/XuanHuang0/WildGHand.