The NTIRE 2026 RAIM challenge introduced a benchmark that asks Multimodal Large Language Models (MLLMs) to perform professional image quality assessment (IQA) by comparing high-quality image pairs and providing expert-level rationales. Participants addressed two tasks, comparative quality selection and interpretative reasoning, pushing MLLMs to mimic human expert cognition. The challenge drew nearly 200 registrations and over 2,500 submissions, and the top-performing methods significantly advanced the state of the art in professional IQA.
Current image quality metrics struggle to articulate *why* one high-quality image is better than another, but this challenge shows MLLMs are closing the gap by providing expert-level explanations.
In this paper, we present an overview of the NTIRE 2026 challenge on the 3rd Restore Any Image Model (RAIM) in the Wild, focusing on Track 1: Professional Image Quality Assessment. Conventional Image Quality Assessment (IQA) typically relies on scalar scores. By compressing complex visual characteristics into a single number, these methods struggle to distinguish subtle differences among uniformly high-quality images. They also cannot articulate why one image is superior, so they lack the reasoning capabilities required to provide guidance for vision tasks. Recent advances in Multimodal Large Language Models (MLLMs) offer a promising way to bridge this gap. Motivated by this potential, our challenge establishes a novel benchmark that explores the ability of MLLMs to mimic human expert cognition when evaluating high-quality image pairs. Participants were tasked with overcoming critical bottlenecks in professional scenarios, centered on two primary objectives: (1) Comparative Quality Selection: reliably identifying the visually superior image within a high-quality pair; and (2) Interpretative Reasoning: generating grounded, expert-level explanations that detail the rationale behind the selection. In total, the challenge attracted nearly 200 registrations and over 2,500 submissions. The top-performing methods significantly advanced the state of the art in professional IQA. The challenge dataset is available at https://github.com/narthchin/RAIM-PIQA, and the official homepage is accessible at https://www.codabench.org/competitions/12789/.