Mar 4, 2026 · arXiv:2603.03714

Order Is Not Layout: Order-to-Space Bias in Image Generation

Yongkang Zhang, Zonglin Zhao, Yuecheng Zhang, Yuechen Zhang, Fei Ding, Pei Li, Wenxuan Wang

AI Summary

The paper identifies and quantifies a previously undocumented "Order-to-Space Bias" (OTS) in image generation models, where the order in which entities are mentioned in a text prompt influences their spatial arrangement in the generated image. It introduces OTS-Bench, a new benchmark that measures this bias by evaluating homogenization (similarity of images generated from prompts with different entity orders) and correctness (adherence to grounded cues). Experiments indicate that OTS is prevalent in text-to-image and image-to-image models, appears to be primarily data-driven, and emerges early in the generation process; targeted fine-tuning and early-stage intervention are shown to mitigate the bias.

Key Contribution

Image generation models exhibit an "Order-to-Space Bias": the order in which objects are mentioned in a prompt can determine their placement in the generated image, even overriding grounded cues.

Abstract

We study a systematic bias in modern image generation models: the mention order of entities in text spuriously determines spatial layout and entity-role binding. We term this phenomenon Order-to-Space Bias (OTS) and show that it arises in both text-to-image and image-to-image generation, often overriding grounded cues and causing incorrect layouts or swapped assignments. To quantify OTS, we introduce OTS-Bench, which isolates order effects with paired prompts differing only in entity order and evaluates models along two dimensions: homogenization and correctness. Experiments show that OTS is widespread in modern image generation models and provide evidence that it is primarily data-driven and manifests during the early stages of layout formation. Motivated by this insight, we show that both targeted fine-tuning and early-stage intervention strategies can substantially reduce OTS while preserving generation quality.
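
The paired-prompt idea in the abstract can be illustrated with a minimal sketch. It is not the OTS-Bench implementation: the function names, the prompt template, and the left-to-right reading of "layout" are all assumptions made for illustration, and the statistic below is a simple stand-in rather than the paper's homogenization or correctness metric, neither of which is defined in this listing. The entity positions are assumed to come from some external object detector.

```python
from itertools import permutations


def paired_prompts(template, entities):
    """Return one prompt per ordering of `entities`.

    Only the mention order differs between the returned prompts, so any
    systematic layout difference can be attributed to order alone.
    (Hypothetical helper; the template format is an assumption.)
    """
    return [template.format(*order) for order in permutations(entities)]


def order_follow_rate(samples):
    """Fraction of images whose layout follows mention order.

    `samples` is a list of (x_first, x_second) pairs: the horizontal
    centers of the first- and second-mentioned entity in one image.
    Assumption: "follows mention order" means first-mentioned is to the
    left. A rate near 0.5 suggests no order effect under this reading.
    """
    if not samples:
        raise ValueError("samples must be non-empty")
    follows = sum(1 for x_first, x_second in samples if x_first < x_second)
    return follows / len(samples)


prompts = paired_prompts("a {} and a {}", ["cat", "dog"])
# Made-up positions for four generated images, normalized to [0, 1].
rate = order_follow_rate([(0.2, 0.7), (0.3, 0.8), (0.6, 0.4), (0.1, 0.9)])
```

Here `prompts` holds the two orderings of the same entity pair, and `rate` is the share of the made-up samples in which the first-mentioned entity lands on the left.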

Computer Vision · Eval Frameworks & Benchmarks · Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 47

Year: 2026

Venue: N/A
