This paper revisits the Multiplane Image (MPI) representation for novel view synthesis and optimizes it differentiably, aiming for faster rendering and a smaller model than methods such as 3D Gaussian Splatting. Using predicted point maps for geometric initialization and one-step diffusion to mitigate holes and artifacts, the authors report an approach that is 30.7% faster than a representative GS-based method and uses only 14.8% of its model size, while maintaining competitive quality on front-view scenarios. The work targets sparse-view conditions and the practical deployment of view synthesis on mobile devices.
With 30.7% faster rendering at only 14.8% of the model size of a representative GS-based method, this approach makes novel view synthesis considerably more efficient in sparse-view scenarios.
Recently, novel view synthesis has witnessed remarkable progress, with mainstream methods such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS) delivering impressive results. However, these approaches often struggle to balance rendering speed and model size, and their optimization-based training can be highly time-consuming. Furthermore, they typically rely on dense observations and often fail to produce satisfactory results under sparse-view conditions. Although feed-forward reconstruction significantly reduces the optimization time of 3DGS, its pixel-aligned formulation generates millions of Gaussians from a single image, severely limiting its practical deployment on mobile devices. To address these limitations, we revisit the Multiplane Image (MPI) representation, which represents scenes using a compact set of planar layers for efficient novel view synthesis. Leveraging recent advances in visual foundation models, we utilize predicted point maps for reliable geometric initialization, followed by differentiable optimization. To address the holes and artifacts in sparsely initialized MPI, we introduce one-step diffusion, which participates in both the differentiable optimization of the MPI and the postprocessing of rendering results. Compared with a representative GS-based method, our approach is 30.7% faster and uses only 14.8% of its model size, while achieving competitive synthesis quality on front-view scenarios.
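To make the "compact set of planar layers" concrete, the following is a minimal sketch of how a standard MPI is typically turned into a pixel color: each plane stores an RGB color and an alpha value, and the planes are blended back to front with the "over" operator. This is the generic MPI compositing rule, not necessarily the authors' renderer; the function name `composite_pixel` and the layer layout are illustrative assumptions, and the per-plane homography warp into the target view is omitted.

```python
def composite_pixel(layers):
    """Blend MPI layers for one pixel with the 'over' operator.

    `layers` is a list of (rgb, alpha) pairs ordered from the farthest
    plane to the nearest one (an assumed, illustrative layout).
    Warping each plane into the target view is not shown here.
    """
    out = [0.0, 0.0, 0.0]  # start from an empty (black) background
    for rgb, alpha in layers:
        # The nearer layer covers a fraction `alpha` of what lies behind it.
        out = [c * alpha + o * (1.0 - alpha) for c, o in zip(rgb, out)]
    return out


# An opaque red far plane seen through a half-transparent blue near plane.
layers = [((1.0, 0.0, 0.0), 1.0), ((0.0, 0.0, 1.0), 0.5)]
print(composite_pixel(layers))  # [0.5, 0.0, 0.5]
```

Because every step is a product or a sum, the blend is differentiable with respect to the layer colors and alphas, which is what lets an MPI be refined by gradient-based optimization after initialization.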