This paper introduces CoTIR, a framework for image restoration that internalizes Chain-of-Thought (CoT) reasoning within a single model, addressing the computational cost and weak modeling of degradation interactions in existing multi-round restoration approaches. By treating image restoration as a subtask of image editing, CoTIR starts from a large-scale pre-trained editing model, which offers a more favorable optimization starting point, and performs holistic restoration without chaining specialized modules. Experiments on CoTIR-Bench and on real composite degradation scenes show that CoTIR achieves stronger perceptual quality and more competitive fidelity than both all-in-one models and multi-round methods.
CoTIR internalizes CoT reasoning in a single model and achieves stronger perceptual quality than all-in-one and multi-round restoration methods, including under complex, mixed degradations.
Image restoration seeks to recover high-quality images from degraded inputs but becomes highly ill-posed under complex, mixed degradations. While unified all-in-one models are common, their performance declines as degradation complexity increases. Recent works adopt Chain-of-Thought (CoT) reasoning for multi-round restoration using specialized modules. However, this approach faces two key limitations: (i) increased computational cost due to multi-step processing, and (ii) weak modeling of interactions between degradations during stepwise inference. We introduce CoTIR, a universal image restoration framework that internalizes CoT reasoning within a single model. Concretely, we view image restoration as a specialized subtask of image editing, which implies that a large-scale pre-trained editing model provides a more favorable optimization starting point. Building on this, we fine-tune the model for restoration and further encode structured CoT-style reasoning into the learning objective via a differentiable formulation inspired by Lagrangian optimization, enabling holistic restoration without chaining specialized restorers. To facilitate training and evaluation, we further present CoTIR-Bench, a large-scale benchmark comprising 5.2 million samples with CoT-style reasoning traces. Extensive experiments on CoTIR-Bench and broad real composite degradation scenes show that CoTIR achieves stronger perceptual quality and more competitive fidelity than both all-in-one models and multi-round restoration methods. The source code is available at https://github.com/gy65896/CoTIR.
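The abstract does not spell out the Lagrangian-inspired objective. As a purely illustrative sketch, and not the authors' actual formulation, one generic way to fold stepwise reasoning constraints into a single differentiable loss is Lagrangian relaxation. All symbols below are assumptions introduced for illustration: $\theta$ denotes the model parameters, $\mathcal{L}_{\mathrm{rec}}$ a restoration loss, $c_k$ a differentiable penalty measuring how well the $k$-th reasoning step (for example, removal of one degradation type) is satisfied, $\epsilon_k$ a tolerance, and $\lambda_k \ge 0$ a multiplier.

```latex
% Hypothetical constrained problem (illustrative only, not taken from the paper)
\min_{\theta} \; \mathcal{L}_{\mathrm{rec}}(\theta)
\quad \text{s.t.} \quad c_k(\theta) \le \epsilon_k, \qquad k = 1, \dots, K

% Lagrangian relaxation: the constraints become terms of one differentiable objective
\mathcal{L}(\theta, \lambda) = \mathcal{L}_{\mathrm{rec}}(\theta)
  + \sum_{k=1}^{K} \lambda_k \bigl( c_k(\theta) - \epsilon_k \bigr),
\qquad \lambda_k \ge 0
```

Under this reading, all $K$ constraints are optimized jointly in a single pass rather than by running $K$ restorers in sequence, which is one sense in which a single model could account for interactions between degradations.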