Search papers, labs, and topics across Lattice.
The paper introduces WorldTree, a novel framework for dynamic scene reconstruction from monocular video that addresses limitations in spatiotemporal decomposition. WorldTree uses a Temporal Partition Tree (TPT) for coarse-to-fine temporal optimization and Spatial Ancestral Chains (SAC) for hierarchical spatial dynamics. Experiments demonstrate that WorldTree achieves state-of-the-art performance, improving LPIPS by 8.26% on NVIDIA-LS and mLPIPS by 9.09% on DyCheck compared to existing methods.
Monocular dynamic scene reconstruction gets a unified spatiotemporal decomposition framework, achieving up to 9% improvement in reconstruction quality.
Dynamic reconstruction has achieved remarkable progress, but there remain challenges in monocular input for more practical applications. The prevailing works attempt to construct efficient motion representations, but lack a unified spatiotemporal decomposition framework, suffering from either holistic temporal optimization or coupled hierarchical spatial composition. To this end, we propose WorldTree, a unified framework comprising Temporal Partition Tree (TPT) that enables coarse-to-fine optimization based on the inheritance-based partition tree structure for hierarchical temporal decomposition, and Spatial Ancestral Chains (SAC) that recursively query ancestral hierarchical structure to provide complementary spatial dynamics while specializing motion representations across ancestral nodes. Experimental results on different datasets indicate that our proposed method achieves 8.26% improvement of LPIPS on NVIDIA-LS and 9.09% improvement of mLPIPS on DyCheck compared to the second-best method. Code: https://github.com/iCVTEAM/WorldTree.