NSFC | Mar 18, 2026 | arXiv:2603.17828

TINA: Text-Free Inversion Attack for Unlearned Text-to-Image Diffusion Models

Qianlong Xiang, Miao Zhang, Haoyu Zhang, Junhui Hou, Liqiang Nie

AI Summary

The paper introduces TINA, a Text-free Inversion Attack, to regenerate erased concepts from text-to-image diffusion models that have undergone concept erasure techniques. TINA leverages DDIM inversion under a null-text condition to bypass text-centric defenses and integrates an optimization procedure to mitigate approximation errors in the absence of textual guidance. Experiments demonstrate that TINA can successfully regenerate erased concepts even in models treated with state-of-the-art unlearning methods, revealing the limitations of current text-centric erasure techniques.

Key Contribution

Concept erasure in text-to-image models is mostly smoke and mirrors: a text-free attack can still regenerate "forgotten" concepts by exploiting the model's latent visual knowledge.

Abstract

Although text-to-image diffusion models exhibit remarkable generative power, concept erasure techniques are essential for their safe deployment to prevent the creation of harmful content. This has fostered a dynamic interplay between the development of erasure defenses and the adversarial probes designed to bypass them, and this co-evolution has progressively enhanced the efficacy of erasure methods. However, this adversarial co-evolution has converged on a narrow, text-centric paradigm that equates erasure with severing the text-to-image mapping, ignoring that the underlying visual knowledge related to undesired concepts still persists. To substantiate this claim, we investigate from a visual perspective, leveraging DDIM inversion to probe whether a generative pathway for the erased concept can still be found. However, identifying such a visual generative pathway is challenging because standard text-guided DDIM inversion is actively resisted by text-centric defenses within the erased model. To address this, we introduce TINA, a novel Text-free INversion Attack, which enforces this visual-only probe by operating under a null-text condition, thereby avoiding existing text-centric defenses. Moreover, TINA integrates an optimization procedure to overcome the accumulating approximation errors that arise when standard inversion operates without its usual textual guidance. Our experiments demonstrate that TINA regenerates erased concepts from models treated with state-of-the-art unlearning methods. The success of TINA proves that current methods merely obscure concepts, highlighting an urgent need for paradigms that operate directly on internal visual knowledge.
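To make the inversion idea concrete, the following is a minimal sketch of standard deterministic DDIM inversion with no text input, followed by a round trip back to the starting point. It is not the authors' implementation: the noise predictors `toy_eps` and `const_eps`, the linear schedule in `make_alpha_bars`, and all function names are hypothetical stand-ins for a real diffusion model queried with a null-text prompt. The sketch only illustrates why inversion accumulates approximation error; it does not reproduce TINA's optimization procedure, which the abstract does not specify.

```python
import numpy as np

def make_alpha_bars(num_steps, start=0.999, end=0.05):
    """Hypothetical cumulative-alpha schedule; index 0 is the least noisy level."""
    return np.linspace(start, end, num_steps + 1)

def ddim_move(x, eps, a_from, a_to):
    """Deterministic DDIM update (eta = 0) between two noise levels."""
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def ddim_invert(x0, eps_model, alpha_bars):
    """Map a clean sample to a noisy latent, step by step.

    Each step reuses eps predicted at the current level in place of the
    unknown eps at the next level; this is the source of inversion error.
    """
    x = x0
    for t in range(len(alpha_bars) - 1):
        eps = eps_model(x, t)  # no text argument: stands in for null-text
        x = ddim_move(x, eps, alpha_bars[t], alpha_bars[t + 1])
    return x

def ddim_sample(x_T, eps_model, alpha_bars):
    """Run the deterministic sampler from the noisy latent back to level 0."""
    x = x_T
    for t in range(len(alpha_bars) - 1, 0, -1):
        eps = eps_model(x, t)
        x = ddim_move(x, eps, alpha_bars[t], alpha_bars[t - 1])
    return x

def round_trip_error(x0, eps_model, alpha_bars):
    """Largest absolute difference between x0 and its reconstruction."""
    x_T = ddim_invert(x0, eps_model, alpha_bars)
    return float(np.max(np.abs(ddim_sample(x_T, eps_model, alpha_bars) - x0)))

def toy_eps(x, t):
    """Hypothetical input-dependent noise predictor (not a trained model)."""
    return 0.5 * np.tanh(x)

def const_eps(x, t):
    """Hypothetical predictor that ignores its input; inversion is then exact."""
    return np.full_like(x, 0.3)

x0 = np.array([0.5, -1.0, 2.0])
alpha_bars = make_alpha_bars(10)
print("input-dependent predictor:", round_trip_error(x0, toy_eps, alpha_bars))
print("constant predictor:", round_trip_error(x0, const_eps, alpha_bars))
```

With the constant predictor the round trip recovers `x0` up to floating-point precision, because the noise estimate used for inversion matches the one used for sampling. With the input-dependent predictor the two estimates differ at every step, so the reconstruction drifts; this is the kind of accumulating error the abstract says TINA's optimization is designed to overcome.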

Computer Vision | Multimodal Models | Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
