GIST Guangdong, Halmstad University, Independent Researcher, Institute of Science Tokyo, Morgan Stanley, PKU, UMacau

May 25, 2026

arXiv:2605.25922

Closed-Loop Bidirectional Prompting for Adversarial Robustness of Vision Language Models

Xiao Liu, Jiaxiang Liu, Boci Peng, Boren Hu, Yusong Wang, Xiwen Chen, Prayag Tiwari, Liming Zhang, Mingkun Xu

AI Summary

This paper introduces Closed-Loop Bidirectional Prompting (CLBP), a novel defense mechanism against adversarial attacks on Vision Language Models (VLMs) that leverages a dynamic feedback loop between visual and textual encoders. CLBP uses a Semantic Anchor to constrain cyclic updates and denoise feature corruption, enabling textual semantics to refine visual representations and vice versa. Experiments across 11 datasets demonstrate that CLBP achieves state-of-the-art adversarial robustness and generalization with a favorable accuracy-cost trade-off.

Key Contribution

VLMs can achieve state-of-the-art adversarial robustness by iteratively refining visual and textual representations through a closed-loop prompting mechanism, even with frozen encoders.

Abstract

Vision Language Models adapt well to downstream tasks but are highly vulnerable to adversarial perturbations that disrupt cross-modal semantic alignment. Existing defenses are largely unidirectional or structural, failing to exploit bidirectional cross-modal complementarity and instance-wise adaptive protection. To overcome these limitations, we propose Closed-Loop Bidirectional Prompting, which casts robust adaptation as cross-modal agreement recovery via a dynamic feedback loop on frozen encoders. A Semantic Anchor is introduced as a stable prior to constrain cyclic updates and mitigate perturbation-induced feature corruption. Through anchor-based bootstrapping, textual semantics denoise visual representations, while the refined visuals enable instance-adaptive prompt updating, yielding a rectified and robust consensus. Extensive evaluations across 11 datasets validate state-of-the-art robustness and strong base-to-new generalization, while maintaining a favorable trade-off between computational cost and accuracy.
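The feedback loop described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not give update rules, so everything below is an assumption made for illustration, including the function name `closed_loop_refine`, the mixing weights `alpha`, `beta`, and `gamma`, the softmax temperature, and the three-dimensional stand-in features. It only shows the general shape of the idea: text features pull a perturbed visual feature toward a class consensus, the refined visual feature nudges the text features in an instance-specific way, and a fixed anchor keeps the text features from drifting.

```python
import math

def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return list(v) if n == 0 else [x / n for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def mix(a, b, w):
    """Convex combination (1 - w) * a + w * b, re-normalized to unit length."""
    return normalize([(1 - w) * x + w * y for x, y in zip(a, b)])

def softmax(scores, temperature=0.1):
    """Temperature-scaled softmax over a list of similarity scores."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def closed_loop_refine(visual, text_feats, anchors,
                       steps=3, alpha=0.3, beta=0.2, gamma=0.3):
    """Hypothetical anchored text<->visual refinement loop (illustration only)."""
    v = normalize(visual)
    texts = [normalize(t) for t in text_feats]
    anchors = [normalize(a) for a in anchors]
    for _ in range(steps):
        # Text -> visual: pull the visual feature toward the soft class consensus.
        probs = softmax([cosine(v, t) for t in texts])
        consensus = [sum(p * t[i] for p, t in zip(probs, texts))
                     for i in range(len(v))]
        v = mix(v, normalize(consensus), alpha)
        # Visual -> text: instance-adaptive update weighted by class probability,
        # then pull each text feature back toward its fixed anchor.
        texts = [mix(mix(t, v, beta * p), a, gamma)
                 for t, a, p in zip(texts, anchors, probs)]
    probs = softmax([cosine(v, t) for t in texts])
    return v, texts, probs

# Toy usage: two classes; the perturbed feature has drifted toward class 1.
anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
perturbed = [0.55, 0.45, 0.3]  # hypothetical adversarially shifted feature
refined, texts, probs = closed_loop_refine(perturbed, anchors, anchors)
```

In this toy setting the refined feature ends up closer to the class-0 anchor than the perturbed input was. The anchor term (`gamma`) is what keeps the loop from reinforcing its own mistakes: without it, the text features could follow a corrupted visual feature arbitrarily far.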

Computer Vision, Multimodal Models, Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
