Clemson University | Apr 30, 2026 | arXiv:2604.27414

Understanding Adversarial Transferability in Vision-Language Models for Autonomous Driving: A Cross-Architecture Analysis

David Fernandez, Pedram MohajerAnsari, Amir Salarpour, M. Pesé, Mert D. Pese

AI Summary

This paper investigates the cross-architecture transferability of physical adversarial attacks on Vision-Language Models (VLMs) used in autonomous driving. The authors evaluate three VLM architectures (Dolphins, OmniDrive, and LeapVAD) using physically realizable patch attacks placed on roadside infrastructure in crosswalk and highway scenarios. The study reports high cross-architecture transfer rates (73–91%) and asymmetric architecture-level vulnerabilities, indicating a systemic weakness in current VLM-based autonomous driving systems.

Key Contribution

Architectural diversity offers surprisingly little defense against adversarial attacks on VLMs for autonomous driving, with physical patches transferring effectively across different models.

Abstract

Vision-language models (VLMs) are increasingly used in autonomous driving because they combine visual perception with language-based reasoning, supporting more interpretable decision-making. Yet their robustness to physical adversarial attacks is not well understood, especially whether such attacks transfer across different VLM architectures, which poses a practical risk when attackers do not know which model a vehicle uses. We address this gap with a systematic cross-architecture study of adversarial transferability in VLM-based driving, evaluating three representative architectures (Dolphins, OmniDrive, and LeapVAD) using physically realizable patches placed on roadside infrastructure in both crosswalk and highway scenarios. Our transfer-matrix evaluation shows high cross-architecture effectiveness, with transfer rates of 73–91% (mean TR = 0.815 for crosswalk and 0.833 for highway) and sustained frame-level manipulation over 64.7–79.4% of the critical decision window even when patches are not optimized for the target model. We further find asymmetric architecture-level risk, with Dolphins most vulnerable to incoming transfer attacks (VS = 0.82) and LeapVAD producing the most transferable patches (TO = 0.882), while models sharing CLIP-based vision encoders exhibit stronger bidirectional transfer. Overall, these results indicate that current VLM-based autonomous driving systems share systematic cross-architecture weaknesses that architectural diversity alone does not resolve, underscoring the need for defenses and design principles that explicitly account for transferability in safety-critical deployment.
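
The abstract reports a mean transfer rate (TR), a vulnerability score (VS), and a transfer-out score (TO) but does not define them here. The following is a minimal sketch of how such summaries could be derived from a transfer matrix, under assumed definitions that may differ from the authors': mean TR as the average of the off-diagonal entries, transfer-out as the off-diagonal row mean for a source model, and vulnerability as the off-diagonal column mean for a target model. The function name `transfer_summary`, the model labels, and the matrix values are hypothetical placeholders, not results from the paper.

```python
import numpy as np

def transfer_summary(matrix, names):
    """Summarize a transfer matrix.

    matrix[i][j] is assumed to hold the attack success rate of a patch
    optimized on model names[i] when applied to model names[j].
    Diagonal entries (same source and target) are excluded from all summaries.
    """
    m = np.asarray(matrix, dtype=float)
    n = len(names)
    off_diag = ~np.eye(n, dtype=bool)

    # Assumed definition: mean transfer rate over all cross-model pairs.
    mean_tr = float(m[off_diag].mean())

    # Assumed definition: how well patches from each source model transfer out.
    transfer_out = {
        names[i]: float((m[i].sum() - m[i, i]) / (n - 1)) for i in range(n)
    }

    # Assumed definition: how exposed each target model is to foreign patches.
    vulnerability = {
        names[j]: float((m[:, j].sum() - m[j, j]) / (n - 1)) for j in range(n)
    }
    return mean_tr, transfer_out, vulnerability


# Placeholder values for illustration only (not the paper's measurements).
names = ["model_a", "model_b", "model_c"]
matrix = [
    [1.0, 0.8, 0.6],
    [0.9, 1.0, 0.7],
    [0.5, 0.4, 1.0],
]
mean_tr, transfer_out, vulnerability = transfer_summary(matrix, names)
```

Reading rows as sources and columns as targets makes the asymmetry described in the abstract easy to inspect: a model can have a high column mean (vulnerable to incoming patches) while having a low row mean (its own patches transfer poorly), or the reverse.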

Computer Vision, Multimodal Models, Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 1

Year: 2026

Venue: SAE technical paper series
