Mar 9, 2026 | arXiv:2603.08316

SlowBA: An efficiency backdoor attack towards VLM-based GUI agents

Junxian Li, T.Y. Lan, Tu Lan, Haozhen Tan, Yan Meng, Haojin Zhu

AI Summary

This paper introduces SlowBA, a novel backdoor attack against VLM-based GUI agents that focuses on manipulating response latency rather than action correctness. The attack uses a two-stage reward-level backdoor injection strategy to induce excessively long reasoning chains when specific, realistic GUI pop-up windows are presented as triggers. Experiments show SlowBA significantly increases response length and latency with minimal impact on task accuracy, even with low poisoning ratios and under defenses.

Key Contribution

VLM-based GUI agents are vulnerable to "SlowBA," a backdoor attack that stealthily inflates response times without affecting task accuracy, revealing a new dimension of security risk beyond action correctness.

Abstract

Modern vision-language-model (VLM) based graphical user interface (GUI) agents are expected not only to execute actions accurately but also to respond to user instructions with low latency. While existing research on GUI-agent security mainly focuses on manipulating action correctness, the security risks related to response efficiency remain largely unexplored. In this paper, we introduce SlowBA, a novel backdoor attack that targets the responsiveness of VLM-based GUI agents. The key idea is to manipulate response latency by inducing excessively long reasoning chains under specific trigger patterns. To achieve this, we propose a two-stage reward-level backdoor injection (RBI) strategy that first aligns the long-response format and then learns trigger-aware activation through reinforcement learning. In addition, we design realistic pop-up windows as triggers that naturally appear in GUI environments, improving the stealthiness of the attack. Extensive experiments across multiple datasets and baselines demonstrate that SlowBA can significantly increase response length and latency while largely preserving task accuracy. The attack remains effective even with a small poisoning ratio and under several defense settings. These findings reveal a previously overlooked security vulnerability in GUI agents and highlight the need for defenses that consider both action correctness and response efficiency. Code is available at https://github.com/tu-tuing/SlowBA.
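
The abstract evaluates the attack along three axes: response length, latency, and task accuracy. As a rough illustration of how such a comparison might be computed from logged agent runs, here is a minimal sketch. It is not taken from the paper or its repository; the `Episode` record, its field names, and the `summarize` helper are all hypothetical, and the paper's actual metrics may be defined differently.

```python
from dataclasses import dataclass


@dataclass
class Episode:
    """One logged agent response (hypothetical record, not from the paper)."""
    triggered: bool    # True if the screenshot contained the pop-up trigger
    num_tokens: int    # length of the generated response in tokens
    latency_s: float   # wall-clock response time in seconds
    correct: bool      # whether the predicted action matched the reference


def _avg(values):
    """Arithmetic mean of a non-empty iterable of numbers or booleans."""
    values = list(values)
    return sum(values) / len(values)


def summarize(episodes):
    """Compare triggered and clean episodes on length, latency, and accuracy."""
    clean = [e for e in episodes if not e.triggered]
    trig = [e for e in episodes if e.triggered]
    if not clean or not trig:
        raise ValueError("need at least one clean and one triggered episode")
    return {
        # Ratios above 1 mean triggered responses are longer or slower.
        "length_ratio": _avg(e.num_tokens for e in trig)
        / _avg(e.num_tokens for e in clean),
        "latency_ratio": _avg(e.latency_s for e in trig)
        / _avg(e.latency_s for e in clean),
        # Positive values mean accuracy fell when the trigger was present.
        "accuracy_drop": _avg(e.correct for e in clean)
        - _avg(e.correct for e in trig),
    }
```

Under this framing, an efficiency backdoor of the kind described would show large length and latency ratios alongside an accuracy drop close to zero, which is why a monitor that tracks only action correctness could miss it.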

Multimodal Models, Red-Teaming & Adversarial Robustness, Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 69

Year: 2026

Venue: N/A
