This paper presents a large-scale benchmark and analysis of Generative Image Restoration (GIR) models, evaluating them across detail, sharpness, semantic correctness, and overall quality. The study compares diffusion-based, GAN-based, PSNR-oriented, and general-purpose generative models, revealing performance disparities and a shift in failure modes from under-generation to over-generation. The authors also train a new Image Quality Assessment (IQA) model aligned with human perception.
Generative image restoration no longer struggles to add detail; it now struggles to control the quality and semantic correctness of the detail it hallucinates.
Generative Image Restoration (GIR) has achieved impressive perceptual realism, but how far have its practical capabilities truly advanced compared with previous methods? To answer this, we present a large-scale study grounded in a new multi-dimensional evaluation pipeline that assesses models on detail, sharpness, semantic correctness, and overall quality. Our analysis covers diverse architectures, including diffusion-based, GAN-based, PSNR-oriented, and general-purpose generation models, revealing critical performance disparities. Furthermore, our analysis uncovers a key evolution in failure modes that signifies a paradigm shift for the perception-oriented low-level vision field. The central challenge is evolving from the previous problem of detail scarcity (under-generation) to the new frontier of detail quality and semantic control (preventing over-generation). We also leverage our benchmark to train a new IQA model that better aligns with human perceptual judgments. Ultimately, this work provides a systematic study of modern generative image restoration models, offering crucial insights that redefine our understanding of their true state and chart a course for future development.