UofT, May 21, 2026, arXiv:2605.22420

Diffusion-guided Generalizable Enhancer for Urban Scene Reconstruction

Henry Che, Jingkang Wang, Yun Chen, Ze Yang, Sivabalan Manivasagam, Raquel Urtasun

AI Summary

The paper introduces GenRe, a diffusion-guided generalizable enhancer for urban scene reconstruction that improves the fidelity of 3D Gaussian representations under large viewpoint shifts. GenRe distills generative priors learned across diverse scenes into a pre-trained 3D Gaussian representation, avoiding costly per-scene optimization and improving generalization to unseen viewpoints. Experiments demonstrate that GenRe outperforms existing methods in reconstruction quality and efficiency, enabling more robust sensor simulation for autonomous driving.

Key Contribution

Forget painstakingly optimizing each urban scene individually: GenRe distills diffusion-based generative priors into 3D Gaussian representations, fixing reconstruction deficiencies in minutes and generalizing to challenging viewpoints.

Abstract

Urban scene reconstruction from real-world observations has emerged as a powerful tool for self-driving development and testing. While current neural rendering approaches achieve high-fidelity rendering along the recorded trajectories, their quality degrades significantly under large viewpoint shifts, limiting their applicability for closed-loop simulation. Recent works have shown promising results in using diffusion models to enhance quality at these challenging viewpoints and distill the improvements back into 3D representations. However, they often require costly per-scene optimization, and the distilled representations remain fragile and fail to generalize beyond a limited set of synthesized views. To address these limitations, we propose GenRe, a novel diffusion-guided generalizable enhancer for urban scene reconstruction. GenRe takes as input any pretrained 3D Gaussian representation and fixes its deficiencies within a few minutes. By learning to distill generative priors across diverse scenes, GenRe efficiently produces a robust, high-fidelity representation that generalizes reliably to challenging unseen viewpoints (e.g., lane changes). Experiments show that GenRe outperforms existing methods in both quality and efficiency and benefits various downstream tasks, enabling robust and scalable sensor simulation for autonomous driving.

Computer Vision, World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
