This paper introduces compositional multi-concept erasure, a new task that addresses a limitation of existing concept erasure methods, which assume only a single target concept per image. The authors propose CoME-Bench, a benchmark for evaluating this task, and Mosaic, a framework that exploits the spatial locality of target concepts in the vector field of flow-based T2I models. Mosaic dynamically constructs concept-specific masks and selectively blends them to remove multiple target concepts in complex scenes without additional optimization. Experiments show that it erases multiple concepts effectively while preserving non-target contexts.
Erasing multiple unwanted concepts from a single generated image is now possible, even when those concepts interact in complex ways.
Concept erasure has emerged as a key research direction for ensuring safe and ethical image synthesis in Text-to-Image (T2I) models. While existing studies have explored concept erasure across multiple concepts, they typically assume only a single target concept per image, a limitation increasingly exposed by modern flow-based T2I models, which can generate complex scenes with multiple concepts simultaneously. To address this gap, we introduce compositional multi-concept erasure, a new task that aims to simultaneously remove multiple target concepts within a single scene. We propose CoME-Bench, a benchmark for evaluating compositional multi-concept erasure, which covers both intra- and cross-category scenarios. We further propose Mosaic, a novel framework for multi-concept erasure in flow-based T2I models, which exploits the spatial locality of target concepts in the vector field by dynamically constructing concept-specific masks and selectively blending them without additional optimization. Extensive experiments demonstrate that Mosaic effectively removes multiple target concepts in complex compositional scenes while preserving non-target contexts.
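The masking-and-blending idea can be illustrated with a toy sketch. This is not the Mosaic implementation: the abstract does not specify how masks are built or blended, so everything below is an assumption made for illustration. The sketch assumes a vector field stored as an `(H, W, C)` array, a hypothetical `concept_mask` that marks locations where a concept-suppressed field deviates from the base field, and a hypothetical `blend_fields` that swaps in the suppressed field only inside each mask, averaging where masks overlap.

```python
import numpy as np


def concept_mask(v_base, v_suppressed, threshold=0.5):
    """Hypothetical mask: 1 where suppressing the concept changes the field most.

    v_base, v_suppressed: arrays of shape (H, W, C).
    Returns an (H, W) array of 0s and 1s.
    """
    diff = np.linalg.norm(v_suppressed - v_base, axis=-1)  # per-location change
    peak = diff.max()
    if peak == 0:
        # The two fields are identical, so nothing is localized.
        return np.zeros_like(diff)
    return (diff / peak >= threshold).astype(v_base.dtype)


def blend_fields(v_base, v_suppressed_list, masks):
    """Replace the base field inside each mask; average where masks overlap.

    Locations covered by no mask keep the base field unchanged, which is
    how this toy version preserves non-target context.
    """
    acc = np.zeros_like(v_base)
    weight = np.zeros(v_base.shape[:-1], dtype=v_base.dtype)
    for v_sup, mask in zip(v_suppressed_list, masks):
        acc += mask[..., None] * v_sup
        weight += mask
    out = v_base.copy()
    covered = weight > 0
    out[covered] = acc[covered] / weight[covered][..., None]
    return out


# Toy example: two concepts occupying different corners of a 4x4 field.
v_base = np.zeros((4, 4, 2))
v_a = v_base.copy()
v_a[:2, :2] = 1.0   # suppressing concept A only changes the top-left corner
v_b = v_base.copy()
v_b[2:, 2:] = 2.0   # suppressing concept B only changes the bottom-right corner

masks = [concept_mask(v_base, v_a), concept_mask(v_base, v_b)]
blended = blend_fields(v_base, [v_a, v_b], masks)
```

In this toy setting, each mask covers only the corner its concept occupies, so the blended field changes in those two corners and matches the base field everywhere else. No training or optimization step is involved, which mirrors the "without additional optimization" property described above, though the real method operates on the model's predicted vector field during sampling.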