The paper introduces Probe-Select, a plug-in module for text-to-image diffusion models that predicts final image quality scores from intermediate denoiser activations at early timesteps. This approach leverages the observation that early activations encode stable structural information correlated with final image fidelity, enabling the termination of unpromising seeds early in the generation process. Experiments across diffusion and flow-matching models demonstrate that Probe-Select reduces sampling costs by over 60% while improving the quality of retained images by accurately ranking candidate seeds.
Rank seeds after only 20% of the denoising trajectory: Probe-Select predicts the final quality of generated images from early denoiser activations, so unpromising seeds can be terminated early, cutting sampling cost by over 60% while improving the quality of the retained images.
Recent text-to-image (T2I) diffusion and flow-matching models can produce highly realistic images from natural language prompts. In practical scenarios, T2I systems are often run in a "generate-then-select" mode: many seeds are sampled and only a few images are kept for use. However, this pipeline is highly resource-intensive, since each candidate requires tens to hundreds of denoising steps and evaluation metrics such as CLIPScore and ImageReward can only be applied post hoc, to finished images. In this work, we address this inefficiency by introducing Probe-Select, a plug-in module that enables efficient evaluation of image quality within the generation process. We observe that certain intermediate denoiser activations, even at early timesteps, encode a stable coarse structure (object layout and spatial arrangement) that strongly correlates with final image fidelity. Probe-Select exploits this property by predicting final quality scores directly from early activations, allowing unpromising seeds to be terminated early. Across diffusion and flow-matching backbones, our experiments show that early evaluation at only 20% of the trajectory accurately ranks candidate seeds and enables selective continuation. This strategy reduces sampling cost by over 60% while improving the quality of the retained images, demonstrating that early structural signals can effectively guide selective generation without altering the underlying generative model. Code is available at https://github.com/Guhuary/ProbeSelect.
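The selective-continuation idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `select_seeds` and `relative_cost`, the stand-in score table, and the choice of 16 seeds with 2 kept are all hypothetical, and the cost model assumes the probe itself adds no overhead.

```python
def select_seeds(seeds, probe_score, keep):
    """Rank seeds by an early predicted quality score and keep the best `keep`.

    `probe_score` is a hypothetical stand-in for a learned probe that maps a
    seed's early denoiser activations to a predicted final quality score.
    """
    ranked = sorted(seeds, key=probe_score, reverse=True)
    return ranked[:keep]


def relative_cost(num_seeds, keep, probe_fraction=0.2):
    """Sampling cost relative to running every seed to completion.

    Every seed pays for the first `probe_fraction` of the trajectory; only
    the kept seeds pay for the remaining steps. Probe overhead is ignored
    (an assumption of this sketch).
    """
    early_steps = num_seeds * probe_fraction
    remaining_steps = keep * (1.0 - probe_fraction)
    return (early_steps + remaining_steps) / num_seeds


# Illustrative scores only; a real probe would compute these from activations.
fake_scores = {0: 0.1, 1: 0.9, 2: 0.5, 3: 0.7}
survivors = select_seeds(list(fake_scores), fake_scores.get, keep=2)
cost = relative_cost(num_seeds=16, keep=2)
```

Under these assumptions, probing 16 seeds at 20% of the trajectory and finishing only 2 of them costs 0.2 + 0.8 × 2/16 = 0.3 of the full budget. More generally, the relative cost is the probe fraction plus the remaining fraction scaled by the share of seeds kept, so the savings depend on how aggressively seeds are pruned.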