Feb 23, 2026 · arXiv:2602.19575

ConceptPrism: Concept Disentanglement in Personalized Diffusion Models via Residual Token Optimization

Minseo Kim, Minchan Kwon, Dongyeun Lee, Yunho Jeon, Junmo Kim

AI Summary

The paper introduces ConceptPrism, a method for disentangling shared visual concepts from image-specific residuals in personalized text-to-image diffusion models. ConceptPrism jointly optimizes a target token representing the shared concept and image-wise residual tokens using a reconstruction loss for fidelity and a novel exclusion loss to force residual tokens to discard the shared concept. Experiments demonstrate that ConceptPrism improves the trade-off between concept fidelity and text alignment by effectively resolving concept entanglement without manual guidance.

Key Contribution

ConceptPrism removes the need for manual guidance such as linguistic cues or segmentation masks: it automatically disentangles the shared concept from image-specific residuals by comparing images within a set, improving the trade-off between concept fidelity and text alignment.

Abstract

Personalized text-to-image generation suffers from concept entanglement, where irrelevant residual information from reference images is captured, leading to a trade-off between concept fidelity and text alignment. Recent disentanglement approaches attempt to solve this utilizing manual guidance, such as linguistic cues or segmentation masks, which limits their applicability and fails to fully articulate the target concept. In this paper, we propose ConceptPrism, a novel framework that automatically disentangles the shared visual concept from image-specific residuals by comparing images within a set. Our method jointly optimizes a target token and image-wise residual tokens using two complementary objectives: a reconstruction loss to ensure fidelity, and a novel exclusion loss that compels residual tokens to discard the shared concept. This process allows the target token to capture the pure concept without direct supervision. Extensive experiments demonstrate that ConceptPrism effectively resolves concept entanglement, achieving a significantly improved trade-off between fidelity and alignment.
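
The joint optimization described above can be illustrated with a deliberately simplified toy, which is not the paper's implementation. The assumptions are all hypothetical: each "image" is a plain vector, the target token is a single shared vector, each residual token is a per-image vector, the reconstruction loss is a squared error, and the exclusion loss is replaced by a stand-in penalty on the mean of the residuals (whatever the residuals have in common). The function name `disentangle` and the parameters `lam`, `lr`, and `steps` are made up for this sketch; the paper's actual losses operate inside a diffusion model and may differ substantially.

```python
def disentangle(images, steps=2000, lr=0.05, lam=1.0):
    """Toy gradient descent on a shared vector plus per-image residuals.

    Loss (an assumed stand-in, not the paper's objective):
        sum_i ||target + residual_i - image_i||^2   # reconstruction
        + lam * ||mean_i(residual_i)||^2            # exclusion stand-in
    """
    n, d = len(images), len(images[0])
    target = [0.0] * d
    residuals = [[0.0] * d for _ in range(n)]

    for _ in range(steps):
        # Component shared by all residuals; the penalty pushes it to zero.
        mean_r = [sum(r[k] for r in residuals) / n for k in range(d)]
        grad_t = [0.0] * d
        for i in range(n):
            for k in range(d):
                err = target[k] + residuals[i][k] - images[i][k]
                grad_t[k] += 2.0 * err
                # Reconstruction gradient plus exclusion gradient.
                grad_r = 2.0 * err + lam * 2.0 * mean_r[k] / n
                residuals[i][k] -= lr * grad_r
        # Update the shared vector after all residual gradients were
        # computed from the old values (a simultaneous update).
        for k in range(d):
            target[k] -= lr * grad_t[k]
    return target, residuals


# Hypothetical data: one shared "concept" plus offsets that sum to zero.
concept = [1.0, -2.0, 0.5]
offsets = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0],
           [0.0, 1.0, -1.0], [0.0, -1.0, 1.0]]
images = [[c + o for c, o in zip(concept, off)] for off in offsets]

target, residuals = disentangle(images)          # with exclusion penalty
target_plain, _ = disentangle(images, lam=0.0)   # reconstruction only
```

In this toy, reconstruction alone does not determine how content is split between the shared vector and the residuals, so with `lam=0.0` the residuals keep part of the shared vector. Adding the penalty moves what the residuals have in common into `target`, which loosely mirrors the role the abstract assigns to the exclusion loss.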

Computer Vision, Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
