The paper introduces EditSpilloverProbe, a framework that evaluates world knowledge in image editing models by analyzing how they alter semantically related content outside the specified edit region. The authors define a taxonomy of spillover types (spatial, semantic, mixed, random) and build a benchmark dataset, EditSpilloverBench, from real-world Chinese text editing tasks. Experiments on five models reveal widely varying spillover rates and a trade-off between editing control and semantic spillover. The authors argue that semantic spillover reflects genuine world understanding rather than spatial diffusion.
Image editing models leak fascinating hints about their world knowledge through "edit spillover" (unintended changes to semantically related regions), and this paper turns that leakage into a probe.
Instruction-following image editing models are expected to modify only the specified region while keeping the rest of the image unchanged. However, in practice, we observe a pervasive phenomenon -- edit spillover: models alter semantically related but unspecified content outside the edit region. This raises a fundamental question -- does spillover reflect genuine implicit world understanding, or is it merely attention leakage? We propose EditSpilloverProbe, a systematic framework that repurposes edit spillover as a natural probe for world knowledge in image editing models. We introduce a spillover taxonomy (spatial, semantic, mixed, random), an automated detection-and-classification pipeline, and a benchmark dataset constructed from real-world Chinese text editing tasks, EditSpilloverBench. Systematic evaluation of 5 representative editing models reveals three core findings: (1) spillover rates vary dramatically across architectures, from 3.49% to 11.46%, with a 3.3x ratio; (2) absolute semantic spillover quantity reveals models' world understanding capability -- nano_banana produces the most semantic spillover (27.8 per image), while qwen_2511 has the most precise editing control but lower semantic spillover (16.3 per image), revealing a trade-off between editing control and world understanding; (3) spatial decay analysis shows spillover area density decays exponentially with distance, but the proportion of semantically relevant spillover remains constant (40%-58%), providing direct evidence that semantic spillover reflects genuine world understanding rather than spatial diffusion.
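The abstract does not spell out how the spillover rate or the spatial decay is computed, so the following is only a minimal sketch under stated assumptions, not the authors' pipeline. It assumes a rectangular edit region, treats a pixel as "changed" when its absolute difference exceeds a hypothetical threshold, defines the spillover rate as the fraction of changed pixels outside the edit region, and fits an exponential decay by linear regression on the log of the density. All function names and parameters here are illustrative.

```python
import numpy as np

def distance_to_box(shape, edit_box):
    """Euclidean distance from every pixel to a rectangular edit region.

    edit_box = (y0, x0, y1, x1) covers rows y0..y1-1 and columns x0..x1-1
    (an assumed convention); pixels inside the box get distance 0.
    """
    y0, x0, y1, x1 = edit_box
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    dy = np.maximum(np.maximum(y0 - ys, ys - (y1 - 1)), 0)
    dx = np.maximum(np.maximum(x0 - xs, xs - (x1 - 1)), 0)
    return np.hypot(dy, dx)

def changed_mask(before, after, threshold=0.1):
    """Boolean mask of pixels whose value changed by more than threshold."""
    diff = np.abs(after.astype(float) - before.astype(float))
    if diff.ndim == 3:  # colour image: use the largest per-channel change
        diff = diff.max(axis=-1)
    return diff > threshold

def spillover_rate(before, after, edit_box, threshold=0.1):
    """Assumed definition: share of pixels outside the edit box that changed."""
    changed = changed_mask(before, after, threshold)
    outside = distance_to_box(changed.shape, edit_box) > 0
    return float(changed[outside].mean())

def density_by_distance(changed, dist, bin_edges):
    """Share of changed pixels in each distance band (lo, hi]."""
    densities = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        band = (dist > lo) & (dist <= hi)
        densities.append(changed[band].mean() if band.any() else np.nan)
    return np.array(densities)

def fit_exponential_decay(centers, densities):
    """Fit density ~ a * exp(-k * d) by least squares on log(density)."""
    centers = np.asarray(centers, dtype=float)
    densities = np.asarray(densities, dtype=float)
    ok = np.isfinite(densities) & (densities > 0)
    slope, intercept = np.polyfit(centers[ok], np.log(densities[ok]), 1)
    return float(np.exp(intercept)), float(-slope)
```

With per-band densities in hand, a positive fitted `k` would correspond to the exponential decay the abstract describes. Checking whether the semantically relevant share stays constant across bands would additionally require a per-region semantic label, which this sketch does not attempt to model.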