Hanyang · Mar 19, 2026 · arXiv:2603.19157

ADAPT: Attention Driven Adaptive Prompt Scheduling and InTerpolating Orthogonal Complements for Rare Concepts Generation

Kwanyoung Lee, Hyunwoo Oh, SeungJu Cha, Sung-Hee Koh, Sungho Koh, Dong-Jin Kim

AI Summary

The paper introduces ADAPT, a training-free framework for generating rare compositional concepts in text-to-image synthesis using diffusion models. ADAPT addresses the limitations of LLM-based prompt scheduling by deterministically planning and semantically aligning prompt schedules based on attention scores and orthogonal components. Experiments on the RareBench benchmark demonstrate that ADAPT significantly improves the compositional generation of rare concepts while maintaining visual integrity.

Key Contribution

Ditch the fine-tuning: this training-free method uses attention scores to generate rare concepts in images with more precision and control than LLM-guided approaches.

Abstract

Generating rare compositional concepts in text-to-image synthesis remains a challenge for diffusion models, particularly for attributes that are uncommon in the training data. While recent approaches, such as R2F, address this challenge by using an LLM for prompt scheduling, they suffer from inherent variance due to the randomness of language models and from suboptimal guidance caused by iterative text embedding switching. To address these problems, we propose ADAPT, a training-free framework that deterministically plans and semantically aligns prompt schedules, providing consistent guidance to enhance the composition of rare concepts. By leveraging attention scores and orthogonal components, ADAPT significantly enhances compositional generation of rare concepts on the RareBench benchmark without additional training or fine-tuning. Through comprehensive experiments, we demonstrate that ADAPT achieves superior performance on RareBench and accurately reflects the semantic information of rare attributes, providing deterministic and precise control over the generation of rare compositions without compromising visual integrity.
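The abstract mentions "orthogonal components" but does not give a formulation. As a rough illustration only, the sketch below shows the generic linear-algebra idea of splitting one embedding into a part parallel to another and an orthogonal remainder, then blending that remainder back in with a weight. The function names, the `alpha` weight, and the toy 2-D vectors are all assumptions made for this example; this is not the paper's algorithm.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthogonal_complement(rare: Vector, base: Vector) -> Vector:
    """Return the part of `rare` that is orthogonal to `base`.

    This is the standard vector rejection used in Gram-Schmidt:
    rare - (<rare, base> / <base, base>) * base.
    """
    denom = dot(base, base)
    if denom == 0.0:
        # A zero base vector has no direction to remove.
        return list(rare)
    scale = dot(rare, base) / denom
    return [r - scale * b for r, b in zip(rare, base)]


def interpolate(base: Vector, rare: Vector, alpha: float) -> Vector:
    """Hypothetical blend: base plus alpha times the orthogonal part of rare."""
    ortho = orthogonal_complement(rare, base)
    return [b + alpha * o for b, o in zip(base, ortho)]


# Toy stand-ins for a "frequent concept" and a "rare concept" embedding.
base = [1.0, 0.0]
rare = [1.0, 1.0]

ortho = orthogonal_complement(rare, base)
mixed = interpolate(base, rare, alpha=0.5)
```

In this toy setup, `alpha = 0` leaves the base embedding unchanged, and larger values add more of the information that the rare embedding carries beyond the base. How ADAPT actually chooses and schedules such weights from attention scores is described in the paper, not here.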

Computer Vision, Multimodal Models, Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 27

Year: 2026

Venue: N/A
