NVIDIA, SJTU | Apr 9, 2026 | arXiv:2604.07986

DP-DeGauss: Dynamic Probabilistic Gaussian Decomposition for Egocentric 4D Scene Reconstruction

Tingxi Chen, Ting-Hsuan Chen, Zhengxue Cheng, Houqiang Zhong, Suhui Wang, Su Wang, Rong Xie, Li Song

AI Summary

DP-DeGauss is a dynamic probabilistic Gaussian decomposition framework that reconstructs dynamic egocentric 4D scenes by disentangling background, hand, and object components. The method initializes 3D Gaussians from COLMAP priors, assigns each a category probability, and routes them to specialized deformation branches. By incorporating category-specific masks, brightness control, and motion-flow control, DP-DeGauss achieves state-of-the-art disentanglement and outperforms existing methods in PSNR, SSIM, and LPIPS.

Key Contribution

A method that disentangles dynamic egocentric scenes into background, hand, and object components, enabling fine-grained understanding and editing.

Abstract

Egocentric video is crucial for next-generation 4D scene reconstruction, with applications in AR/VR and embodied AI. However, reconstructing dynamic first-person scenes is challenging due to complex ego-motion, occlusions, and hand-object interactions. Existing decomposition methods are ill-suited to this setting: they assume fixed viewpoints or merge all dynamics into a single foreground. To address these limitations, we introduce DP-DeGauss, a dynamic probabilistic Gaussian decomposition framework for egocentric 4D reconstruction. Our method initializes a unified 3D Gaussian set from COLMAP priors, augments each Gaussian with a learnable category probability, and dynamically routes the Gaussians into specialized deformation branches for background, hand, and object modeling. We employ category-specific masks for better disentanglement and introduce brightness and motion-flow control to improve static rendering and dynamic reconstruction. Extensive experiments show that DP-DeGauss outperforms baselines by +1.70 dB in PSNR on average, with additional gains in SSIM and LPIPS. More importantly, our framework is the first to disentangle background, hand, and object components, and does so with state-of-the-art quality, enabling explicit, fine-grained separation and paving the way for more intuitive egocentric scene understanding and editing.
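
The routing idea in the abstract (a learnable category probability per Gaussian that selects a deformation branch) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the per-Gaussian `logits`, the softmax normalization, the hard argmax routing (the actual method may blend branches softly), and the placeholder `deform` functions that stand in for learned deformation networks.

```python
import math

# Category names follow the abstract; the ordering is an assumption.
CATEGORIES = ("background", "hand", "object")


def softmax(logits):
    """Turn raw per-Gaussian scores into category probabilities."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(gaussians):
    """Group Gaussian indices by their most probable category.

    Hard argmax routing is a simplification chosen for this sketch.
    """
    branches = {name: [] for name in CATEGORIES}
    for idx, gaussian in enumerate(gaussians):
        probs = softmax(gaussian["logits"])
        best = max(range(len(CATEGORIES)), key=lambda k: probs[k])
        branches[CATEGORIES[best]].append(idx)
    return branches


def deform(position, category, t):
    """Hypothetical stand-in for a learned, per-category deformation."""
    x, y, z = position
    if category == "background":
        return (x, y, z)            # static branch: no motion
    if category == "hand":
        return (x + 0.1 * t, y, z)  # toy linear drift along x
    return (x, y + 0.05 * t, z)     # toy linear drift along y


# Toy Gaussians: only a position and category logits, no covariance or color.
gaussians = [
    {"position": (0.0, 0.0, 1.0), "logits": [2.0, 0.1, 0.1]},
    {"position": (0.2, 0.1, 0.5), "logits": [0.1, 1.5, 0.3]},
    {"position": (0.3, 0.0, 0.6), "logits": [0.0, 0.2, 1.8]},
]

branches = route(gaussians)
for category, indices in branches.items():
    for idx in indices:
        moved = deform(gaussians[idx]["position"], category, t=1.0)
        print(category, idx, moved)
```

In a real pipeline the logits would be optimized jointly with the Gaussian parameters, and the category-specific masks mentioned in the abstract would presumably supervise them. That training loop is outside the scope of this sketch.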

Computer Vision, Multimodal Models, Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 22

Year: 2026

Venue: N/A
