UIUC, UMD | Apr 30, 2026 | arXiv:2604.28193

Generalizable Sparse-View 3D Reconstruction from Unconstrained Images

Vinayak Gupta, Chih-Hao Lin, Shenlong Wang, Anand Bhattad, Jia-Bin Huang

AI Summary

GenWildSplat, a novel feed-forward framework, tackles the challenge of 3D scene reconstruction from sparse, unposed images by predicting depth, camera parameters, and 3D Gaussians in a canonical space. It uses learned geometric priors, an appearance adapter for handling varying lighting, and semantic segmentation to address transient objects. The model is trained with curriculum learning on synthetic and real data, achieving state-of-the-art feed-forward rendering quality on PhotoTourism and MegaScenes benchmarks without per-scene optimization.

Key Contribution

GenWildSplat removes per-scene optimization: a purely feed-forward model reconstructs 3D scenes from sparse, unposed images with real-time inference, reaching state-of-the-art rendering quality among feed-forward methods.

Abstract

Reconstructing 3D scenes from sparse, unposed images remains challenging under real-world conditions with varying illumination and transient occlusions. Existing methods rely on scene-specific optimization using appearance embeddings or dynamic masks, which requires extensive per-scene training and fails under sparse views. Moreover, evaluations on limited scenes raise questions about generalization. We present GenWildSplat, a feed-forward framework for sparse-view outdoor reconstruction that requires no per-scene optimization. Given unposed internet images, GenWildSplat predicts depth, camera parameters, and 3D Gaussians in a canonical space using learned geometric priors. An appearance adapter modulates appearance for target lighting conditions, while semantic segmentation handles transient objects. Through curriculum learning on synthetic and real data, GenWildSplat generalizes across diverse illumination and occlusion patterns. Evaluations on the PhotoTourism and MegaScenes benchmarks demonstrate state-of-the-art feed-forward rendering quality, achieving real-time inference without test-time optimization.
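
To make the geometry step concrete, here is a minimal sketch of how a predicted depth map and camera parameters can be lifted into 3D points in a shared frame, with transient pixels dropped by a segmentation mask. It is not the authors' implementation: the pinhole camera model, the function name `unproject_depth`, and all parameter names are assumptions chosen for illustration, and the abstract does not specify how GenWildSplat actually places its Gaussians.

```python
import numpy as np

def unproject_depth(depth, K, cam_to_world, keep_mask=None):
    """Lift a depth map to 3D points in a shared frame (hypothetical helper).

    depth:        (H, W) depth along the camera z-axis
    K:            (3, 3) pinhole intrinsics (assumed camera model)
    cam_to_world: (4, 4) camera-to-world transform
    keep_mask:    optional (H, W) bool array, False on transient pixels
    """
    H, W = depth.shape
    # Pixel grid sampled at pixel centers, flattened in row-major order.
    u, v = np.meshgrid(np.arange(W) + 0.5, np.arange(H) + 0.5)
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Back-project: X_cam = depth * K^{-1} [u, v, 1]^T
    rays = pixels @ np.linalg.inv(K).T
    pts_cam = rays * depth.reshape(-1, 1)
    # Rotate and translate into the shared frame.
    pts_world = pts_cam @ cam_to_world[:3, :3].T + cam_to_world[:3, 3]
    if keep_mask is not None:
        # Discard points that fall on transient objects (people, cars, ...).
        pts_world = pts_world[keep_mask.reshape(-1)]
    return pts_world
```

Each surviving point could then serve as the center of one Gaussian; the remaining Gaussian attributes (scale, rotation, opacity, color) would come from the network and are outside the scope of this sketch.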

Computer Vision, Robotics & Embodied AI, Training Efficiency & Optimization, World Models & Planning

