NoPo4D is introduced as the first feed-forward system to jointly address dynamic content, multi-view input, and unknown camera poses for 3D scene reconstruction. It does so by decomposing Gaussian motion into per-pixel image-plane shifts and depth changes, so that the 2D component can be supervised directly with pseudo ground-truth optical flow. The system also incorporates a bidirectional motion encoder and view-dependent opacity. It outperforms prior feed-forward baselines and, with an optional post-optimization stage, surpasses per-scene optimization methods while running orders of magnitude faster.
Finally, a feed-forward method cracks dynamic 3D scene reconstruction from multi-view video without needing camera poses, opening the door to real-time applications.
Recent feed-forward 3D Gaussian Splatting methods have made dramatic progress on individual aspects of 3D scene reconstruction, but no existing method jointly addresses dynamic content, multi-view input, and unknown camera poses in a single feed-forward pass. Methods that handle dynamics either require accurate camera poses or accept only monocular input; pose-free multi-view methods address only static scenes; and per-scene optimization methods bridge some of these gaps, but at a cost of minutes to hours per scene. We introduce NoPo4D, the first feed-forward system to fill this empty quadrant. Building on a pretrained geometry backbone and recent 4D Gaussian frameworks, NoPo4D introduces a velocity decomposition that splits Gaussian motion into per-pixel image-plane shifts and depth changes, allowing direct supervision from pseudo ground-truth optical flow on the 2D component. This sidesteps both the differentiable rendering that couples prior posed methods to pose accuracy and the 3D motion ground truth that prior pose-free methods require. The system is rounded out by a bidirectional motion encoder for cross-view and cross-frame feature aggregation, and by view-dependent opacity that mitigates cross-view and cross-timestep Gaussian misalignments. On four multi-view dynamic benchmarks, NoPo4D consistently outperforms prior feed-forward baselines and, with an optional post-optimization stage, surpasses per-scene optimization methods while running orders of magnitude faster.
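To make the velocity decomposition concrete, the following is a minimal sketch of how a per-pixel image-plane shift and a depth change could be recombined into a 3D displacement, with only the 2D part compared against optical flow. It is an illustration under stated assumptions, not the paper's implementation: it assumes a pinhole camera with known intrinsics `K`, expresses motion in the camera frame of a single view, and uses an L1 flow loss. The function names `unproject`, `compose_motion`, and `flow_loss` are hypothetical.

```python
import numpy as np

def unproject(uv, depth, K):
    """Lift pixels (N, 2) with depths (N,) to camera-space points (N, 3).

    Assumes a pinhole model with intrinsics K (3x3); this is an
    illustrative assumption, not taken from the abstract.
    """
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (uv[:, 0] - cx) / fx * depth
    y = (uv[:, 1] - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)

def compose_motion(uv, depth, shift_2d, depth_change, K):
    """Recombine a 2D image-plane shift and a depth change into a 3D displacement."""
    p0 = unproject(uv, depth, K)                            # position at time t
    p1 = unproject(uv + shift_2d, depth + depth_change, K)  # position at time t+1
    return p1 - p0

def flow_loss(shift_2d, pseudo_flow):
    """Mean L1 distance between the predicted 2D shift and pseudo ground-truth flow.

    Only the image-plane component is supervised here; the choice of L1
    is an assumption made for this sketch.
    """
    return float(np.abs(shift_2d - pseudo_flow).mean())

# Hypothetical example: one pixel at the principal point, 2 units deep.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
uv = np.array([[50.0, 50.0]])
depth = np.array([2.0])
shift_2d = np.array([[10.0, 0.0]])   # moves 10 pixels to the right
depth_change = np.array([0.5])       # moves 0.5 units away from the camera

displacement = compose_motion(uv, depth, shift_2d, depth_change, K)
loss = flow_loss(shift_2d, np.array([[9.0, 0.0]]))
```

Because the flow loss touches only `shift_2d`, it needs neither rendered images nor 3D motion labels, which is the property the abstract attributes to the decomposition.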