Syn4D, a new multiview synthetic dataset, is introduced to address the scarcity of high-quality datasets for dense 3D reconstruction and tracking of dynamic scenes from monocular video. The dataset provides ground-truth camera motion, depth maps, dense tracking, and parametric human pose annotations, enabling the unprojection of any pixel into 3D at any time and from any camera. Evaluations across 4D scene reconstruction, 3D point tracking, geometry-aware camera retargeting, and human pose estimation demonstrate the dataset's utility for dynamic scene understanding and spatiotemporal modeling.
Training on Syn4D may improve models for dynamic scene understanding, an area where current datasets fall short of providing dense, complete, and accurate geometric annotations.
Dense 3D reconstruction and tracking of dynamic scenes from monocular video remains an important open challenge in computer vision. Progress in this area has been constrained by the scarcity of high-quality datasets with dense, complete, and accurate geometric annotations. To address this limitation, we introduce Syn4D, a multiview synthetic dataset of dynamic scenes that includes ground-truth camera motion, depth maps, dense tracking, and parametric human pose annotations. A key feature of Syn4D is the ability to unproject any pixel into 3D at any time and from any camera. We conduct extensive evaluations across multiple downstream tasks to demonstrate the utility and effectiveness of the proposed dataset, including 4D scene reconstruction, 3D point tracking, geometry-aware camera retargeting, and human pose estimation. The experimental results highlight Syn4D's potential to facilitate research in dynamic scene understanding and spatiotemporal modeling.
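The unprojection described above can be illustrated with a minimal sketch. It assumes a standard pinhole camera, depth stored as z-depth along the optical axis, and camera-to-world poses given as a rotation matrix plus translation. None of these conventions are confirmed by the abstract, and the function and parameter names (`unproject_pixel`, `project_point`, `fx`, `fy`, `cx`, `cy`, `R_c2w`, `t_c2w`) are hypothetical, not part of any Syn4D API. Following a point across time would additionally use the dataset's dense tracking annotations, which this sketch does not model.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def unproject_pixel(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
    R_c2w: Sequence[Sequence[float]], t_c2w: Sequence[float],
) -> Vec3:
    """Lift pixel (u, v) with z-depth `depth` to a world-space 3D point.

    Assumes a pinhole model and a camera-to-world pose (hypothetical
    conventions; check the dataset's own documentation).
    """
    # Back-project through the pinhole model into the camera frame.
    p_cam = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)
    # Rigid transform into the world frame: X_world = R * X_cam + t.
    x, y, z = (
        sum(R_c2w[i][j] * p_cam[j] for j in range(3)) + t_c2w[i]
        for i in range(3)
    )
    return (x, y, z)


def project_point(
    p_world: Sequence[float],
    fx: float, fy: float, cx: float, cy: float,
    R_c2w: Sequence[Sequence[float]], t_c2w: Sequence[float],
) -> Vec3:
    """Project a world-space point into a camera; returns (u, v, z-depth)."""
    # Invert the rigid transform: X_cam = R^T * (X_world - t).
    d = [p_world[i] - t_c2w[i] for i in range(3)]
    p_cam = [sum(R_c2w[j][i] * d[j] for j in range(3)) for i in range(3)]
    # Perspective division followed by the intrinsics.
    u = fx * p_cam[0] / p_cam[2] + cx
    v = fy * p_cam[1] / p_cam[2] + cy
    return (u, v, p_cam[2])


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    # Camera A sits one unit along the world x-axis.
    point = unproject_pixel(150, 50, 2.0, 100, 100, 50, 50, identity, [1.0, 0.0, 0.0])
    # Re-render the same 3D point from camera B at the world origin.
    print(point, project_point(point, 100, 100, 50, 50, identity, [0.0, 0.0, 0.0]))
```

Chaining `unproject_pixel` in one view with `project_point` in another is how a pixel can be transferred between cameras of a multiview capture, provided both poses share a world frame.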