Apr 6, 2026 | arXiv:2604.04630

Multimodal Backdoor Attack on VLMs for Autonomous Driving via Graffiti and Cross-Lingual Triggers

Jiancheng Wang, Lidan Liang, Zengzhen Su, Haifeng Xia, Yuanting Yan, Wei Wang

AI Summary

This paper introduces GLA, a multimodal backdoor attack on VLMs for autonomous driving. It combines graffiti-based visual triggers, generated via Stable Diffusion inpainting, with cross-lingual text triggers that preserve semantic consistency. On DriveVLM, GLA achieves a 90% Attack Success Rate (ASR) at a 10% poisoning ratio while maintaining a 0% False Positive Rate (FPR). The attack even improves performance on clean tasks, which makes it difficult to detect with methods that look for performance degradation.

Key Contribution

VLMs in self-driving cars are shockingly vulnerable: a subtle combination of graffiti and foreign-language commands can hijack their behavior without degrading performance on normal tasks.

Abstract

Visual language models (VLMs) are rapidly being integrated into safety-critical systems such as autonomous driving, making them an important attack surface for backdoor attacks. Existing backdoor attacks mainly rely on unimodal, explicit, and easily detectable triggers, which makes it difficult to construct attack channels that are both covert and stable in autonomous driving scenarios. GLA introduces two naturalistic triggers: graffiti-based visual patterns generated via Stable Diffusion inpainting, which blend seamlessly into urban scenes, and cross-lingual text triggers, which introduce distributional shifts while maintaining semantic consistency to build robust language-side trigger signals. Experiments on DriveVLM show that GLA requires only a 10% poisoning ratio to achieve a 90% Attack Success Rate (ASR) and a 0% False Positive Rate (FPR). More insidiously, the backdoor does not weaken the model on clean tasks but instead improves metrics such as BLEU-1, making it difficult for traditional performance-degradation-based detection methods to identify the attack. This study reveals underestimated security threats in self-driving VLMs and provides a new attack paradigm for backdoor evaluation in safety-critical multimodal systems.
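The abstract reports a poisoning ratio, an ASR, and an FPR without spelling out how they are computed. The following is a minimal sketch under commonly used definitions, which may differ from the paper's exact protocol: ASR is taken as the fraction of triggered inputs that yield the attacker's target output, and FPR as the fraction of clean inputs that do. The helper names, the `is_target` predicate, and the `"accelerate"` target string are all hypothetical and chosen only for illustration.

```python
import random


def choose_poison_indices(n_samples, ratio, seed=0):
    """Pick which training samples would receive the triggers.

    Hypothetical helper: selects round(n_samples * ratio) distinct indices.
    """
    rng = random.Random(seed)
    k = round(n_samples * ratio)
    return sorted(rng.sample(range(n_samples), k))


def attack_success_rate(triggered_outputs, is_target):
    """Fraction of outputs on triggered inputs that match the target behavior."""
    if not triggered_outputs:
        return 0.0
    hits = sum(1 for out in triggered_outputs if is_target(out))
    return hits / len(triggered_outputs)


def false_positive_rate(clean_outputs, is_target):
    """Fraction of outputs on clean inputs that match the target behavior."""
    if not clean_outputs:
        return 0.0
    hits = sum(1 for out in clean_outputs if is_target(out))
    return hits / len(clean_outputs)


# Illustrative usage with made-up model outputs (not data from the paper).
def is_target(output):
    # Assumed target behavior, for illustration only.
    return output == "accelerate"


poisoned = choose_poison_indices(n_samples=1000, ratio=0.10)
triggered_outputs = ["accelerate"] * 9 + ["brake"]
clean_outputs = ["brake"] * 5

print(len(poisoned))
print(attack_success_rate(triggered_outputs, is_target))
print(false_positive_rate(clean_outputs, is_target))
```

Under these definitions, a high ASR combined with a zero FPR means the target behavior appears only when the triggers are present, which is why clean-task metrics alone would not reveal the backdoor.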

Computer Vision | Multimodal Models | Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
