Free-Range Gaussians is a generative 3D reconstruction method that predicts non-grid-aligned 3D Gaussians from multi-view images using flow matching over Gaussian parameters. The generative formulation allows supervision with non-grid-aligned 3D data and enables plausible content synthesis in unobserved regions, addressing limitations of grid-aligned methods. Hierarchical patching reduces the transformer sequence length, while a timestep-weighted rendering loss during training, together with photometric gradient guidance and classifier-free guidance at inference, improves fidelity. The method outperforms pixel- and voxel-aligned methods on the Objaverse and Google Scanned Objects datasets, particularly when the input views leave parts of the object unobserved.
Ditch the grid: Free-Range Gaussians synthesizes plausible 3D content from sparse views by predicting non-grid-aligned Gaussians, filling in the blanks where other methods fall apart.
We present Free-Range Gaussians, a multi-view reconstruction method that predicts non-pixel, non-voxel-aligned 3D Gaussians from as few as four images. This is done through flow matching over Gaussian parameters. Our generative formulation of reconstruction allows the model to be supervised with non-grid-aligned 3D data, and enables it to synthesize plausible content in unobserved regions. Thus, it improves on prior methods that produce highly redundant grid-aligned Gaussians and suffer from holes or blurry conditional means in unobserved regions. To handle the number of Gaussians needed for high-quality results, we introduce a hierarchical patching scheme that groups spatially related Gaussians into joint transformer tokens, halving the sequence length while preserving structure. We further propose a timestep-weighted rendering loss during training, and photometric gradient guidance and classifier-free guidance at inference, to improve fidelity. Experiments on Objaverse and Google Scanned Objects show consistent improvements over pixel- and voxel-aligned methods while using significantly fewer Gaussians, with large gains when input views leave parts of the object unobserved.
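To make the ingredients named in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It illustrates three generic ideas: grouping pairs of Gaussians into joint tokens (which halves the sequence length), building a flow-matching training pair, and combining conditional and unconditional velocities with classifier-free guidance. Everything specific here is an assumption made for illustration: the 14-value Gaussian parameterization, the pairing of consecutive rows as a stand-in for the hierarchical patching scheme, the linear interpolation path with noise at t=0 and data at t=1, and all function names.

```python
import numpy as np

# Assumed layout per Gaussian (hypothetical): 3 position + 3 scale
# + 4 rotation quaternion + 1 opacity + 3 color = 14 values.
GAUSSIAN_DIM = 14


def group_into_tokens(gaussians, group_size=2):
    """Concatenate `group_size` consecutive Gaussians into one token.

    Assumes rows are already ordered so that consecutive rows are
    spatially related; the ordering itself is not shown here.
    """
    n, d = gaussians.shape
    assert n % group_size == 0, "number of Gaussians must divide evenly"
    return gaussians.reshape(n // group_size, group_size * d)


def ungroup_tokens(tokens, group_size=2):
    """Inverse of group_into_tokens."""
    m, gd = tokens.shape
    return tokens.reshape(m * group_size, gd // group_size)


def flow_matching_pair(x1, t, rng):
    """Build a training pair on a straight path from noise x0 to data x1.

    Returns the interpolated sample x_t and the target velocity x1 - x0
    that a network would be trained to regress.
    """
    x0 = rng.standard_normal(x1.shape)
    xt = (1.0 - t) * x0 + t * x1
    return xt, x1 - x0


def cfg_velocity(v_cond, v_uncond, scale):
    """Classifier-free guidance: extrapolate from the unconditional
    velocity toward the conditional one."""
    return v_uncond + scale * (v_cond - v_uncond)


def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x
```

In a real system `velocity_fn` would be a transformer over the grouped tokens, conditioned on the input images, and `cfg_velocity` would be applied inside it at each step. With `group_size=2`, a set of N Gaussians becomes N/2 tokens, which matches the halving of sequence length described above.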