Hefei University of Technology
STEDiff improves text-to-image alignment without costly fine-tuning, keeping generated images semantically consistent even for complex prompts.
Diffusion models can achieve sharper text-to-image alignment by explicitly aggregating related attention and isolating unrelated attention.