PolyU, TJU, ZJU | Jun 8, 2026 | arXiv:2606.09151

Customization under Fire: Plugin Poisoning in Text-to-Image Ecosystem

Jiahao Chen, Xing He, Yong Yang, Xinfeng Li, Chunyi Zhou, Junhao Li, Zhe Ma, Tianyu Du, Shouling Ji

AI Summary

This paper systematically investigates the security vulnerabilities associated with Low-Rank Adaptation (LoRA) plugins in text-to-image (T2I) models, introducing PoisonLoRA as a framework to analyze plugin poisoning risks. It identifies two primary attack vectors: Concept Hijacking, which can manipulate public opinion through generated images, and Task Injection, which enables the creation of harmful content activated by secret keys. The study reveals that these malicious plugins can propagate like viruses, achieving nearly 100% attack success rates across multiple datasets and platforms while remaining undetected, posing significant risks to user trust and safety in the T2I ecosystem.

Key Contribution

Malicious LoRA plugins can hijack public sentiment and spread harmful content, achieving nearly 100% success rates without detection.

Abstract

The prosperity of text-to-image (T2I) models has fostered a vibrant share-and-play ecosystem centered on Low-Rank Adaptation (LoRA) plugins, which allow users to customize and share model capabilities with ease. This democratization, however, comes with a hidden but severe security risk. Malicious users could distribute seemingly benign LoRA plugins that contain hidden functionality, poisoning model-sharing markets such as Civitai and Liblib, undermining the user trust that underpins this collaborative ecosystem, and threatening the safety of countless downstream applications. Despite these risks, plugin poisoning in the real-world T2I ecosystem remains underexplored. This paper introduces PoisonLoRA, the first systematic study of LoRA plugin supply-chain risks, which exploits the trust and characteristics of the T2I ecosystem. We identify two primary attack instances: (1) Concept Hijacking, where a hijacked LoRA generates images intended to influence public opinion and spread propaganda, and (2) Task Injection, where a LoRA is implanted with the ability to produce harmful content (e.g., NSFW images) that is activated only by a secret key. Critically, the malicious payload persists and propagates in a virus-like manner: it weaponizes the very act of creative collaboration (e.g., LoRA merging) to spread, turning every remix into a new carrier. Extensive experiments validate that PoisonLoRA is both effective and stealthy. Specifically, we achieve approximately 100% attack success rate (ASR) on both Civitai and Liblib, on 6 datasets across 4 scenarios, without being detected by the platforms. The poisoned LoRA is also highly robust, retaining nearly 100% ASR even when transferred to different base models and remixed more than 5 times.

Multimodal Models | Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
