The paper introduces Adaptive Auxiliary Prompt Blending (AAPB), a training-free framework to improve target-faithfulness in diffusion-based text-to-image generation, particularly for rare concepts or complex edits. AAPB adaptively blends auxiliary anchor prompts with the target prompt at each diffusion step, using a closed-form coefficient derived from Tweedie's identity to balance their influence. Experiments on RareBench and FlowEdit datasets demonstrate that AAPB achieves superior semantic accuracy and structural fidelity compared to existing training-free methods.
Diffusion models can now generate rare concepts and execute complex edits with greater fidelity, thanks to a training-free prompt blending technique that leverages statistical properties of the diffusion process itself.
Diffusion-based text-to-image (T2I) models have made remarkable progress in generating photorealistic and semantically rich images. However, when the target concepts lie in low-density regions of the training distribution, these models often produce semantically misaligned or structurally inconsistent results. This limitation arises from the long-tailed nature of text-image datasets, where rare concepts and editing instructions are underrepresented. To address this, we introduce Adaptive Auxiliary Prompt Blending (AAPB), a unified framework that stabilizes the diffusion process in low-density regions. AAPB leverages auxiliary anchor prompts to provide semantic support in rare concept generation and structural support in image editing, ensuring faithful guidance toward the target prompt. Unlike prior heuristic prompt alternation methods, AAPB derives a closed-form adaptive coefficient that optimally balances the influence of the auxiliary anchor and the target prompt at each diffusion step. Grounded in Tweedie's identity, our formulation provides a principled and training-free framework for adaptive prompt blending, ensuring stable and target-faithful generation. We demonstrate the effectiveness of adaptive interpolation over fixed interpolation through controlled experiments and empirically show consistent improvements on the RareBench and FlowEdit datasets, achieving superior semantic accuracy and structural fidelity compared to prior training-free baselines.
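As a rough illustration of per-step prompt blending, the Python sketch below combines two noise predictions with a coefficient that is recomputed at every step instead of being fixed. The abstract does not state AAPB's actual coefficient, so `least_squares_gamma` and its `eps_ref` argument are hypothetical stand-ins that only show what a closed-form, per-step coefficient can look like. The Tweedie-style clean-sample estimate assumes the standard DDPM forward process x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps, which the abstract also does not specify. All function names are illustrative.

```python
import math


def tweedie_x0(x_t, eps_hat, alpha_bar):
    """Estimate the clean sample from a noisy one (Tweedie-style posterior mean).

    Assumes x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps.
    """
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [(x - noise * e) / signal for x, e in zip(x_t, eps_hat)]


def blend(eps_target, eps_anchor, gamma):
    """Linear interpolation: gamma=0 is pure target, gamma=1 is pure anchor."""
    return [(1.0 - gamma) * t + gamma * a for t, a in zip(eps_target, eps_anchor)]


def least_squares_gamma(eps_target, eps_anchor, eps_ref):
    """HYPOTHETICAL closed-form coefficient, not the paper's formula.

    Returns the gamma in [0, 1] that minimizes
    ||blend(eps_target, eps_anchor, gamma) - eps_ref||^2.
    """
    diff = [a - t for a, t in zip(eps_anchor, eps_target)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # Anchor and target agree, so the coefficient has no effect.
        return 0.0
    numer = sum((r - t) * d for r, t, d in zip(eps_ref, eps_target, diff))
    return min(1.0, max(0.0, numer / denom))


if __name__ == "__main__":
    # Toy two-dimensional "noise predictions" for a single diffusion step.
    eps_target = [0.0, 0.0]
    eps_anchor = [2.0, 0.0]
    eps_ref = [0.5, 1.0]  # hypothetical reference direction
    gamma = least_squares_gamma(eps_target, eps_anchor, eps_ref)
    eps_mix = blend(eps_target, eps_anchor, gamma)
    print(gamma, eps_mix, tweedie_x0([1.1, -1.45], eps_mix, alpha_bar=0.64))
```

In a real sampler these lists would be latent tensors, and the two predictions would come from the same denoiser conditioned on the target prompt and on the anchor prompt. The point of the sketch is only the contrast with fixed interpolation: `gamma` is a function of the current step's predictions, not a preset schedule.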