Tsinghua AI, Ant Group, Hangzhou Dianzi University, ZJU
May 21, 2026
arXiv:2605.22321

Benchmarking Autonomous Agents against Temporal, Spatial, and Semantic Evasions

Jianan Ma, Xiaohu Du, Ruixiao Lin, Yaoxiang Bian, Jialuo Chen, Jingyi Wang, Xiaofang Yang, Shiwen Cui, Changhua Meng, Xinhao Deng

AI Summary

This paper introduces a multi-dimensional evasion framework (Temporal, Spatial, and Semantic) to evaluate the security vulnerabilities of LLM-based autonomous agents in stateful, multi-turn interactions. The authors construct A3S-Bench, a benchmark with 2,254 real-world agent execution trajectories, to systematically quantify these threats across 10 mainstream LLMs and 20 practical threat scenarios. Results show that the proposed evasion framework significantly elevates the average risk trigger rate from 28.3% to 52.6%, exposing critical architecture-level vulnerabilities.
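To make the headline numbers concrete, the following is a minimal sketch of how a risk trigger rate could be computed and compared. It assumes the rate is the fraction of trajectories in which the risky behavior was triggered, which the summary does not spell out. The `Trajectory` record, its fields, and the toy data are hypothetical and do not reflect the A3S-Bench schema. Only the two rates passed to `compare` come from the text.

```python
from dataclasses import dataclass


@dataclass
class Trajectory:
    """Hypothetical record of one agent run (not the A3S-Bench schema)."""
    scenario: str
    risk_triggered: bool


def risk_trigger_rate(trajectories):
    """Fraction of trajectories in which the risky behavior was triggered.

    Assumed definition; the source does not define the metric here.
    """
    if not trajectories:
        raise ValueError("need at least one trajectory")
    return sum(t.risk_triggered for t in trajectories) / len(trajectories)


def compare(baseline_rate, evasion_rate):
    """Return (gain in percentage points, ratio evasion / baseline)."""
    return (evasion_rate - baseline_rate) * 100, evasion_rate / baseline_rate


# Toy data, invented purely for illustration.
toy = [
    Trajectory("s1", True),
    Trajectory("s1", False),
    Trajectory("s2", False),
    Trajectory("s2", True),
]
toy_rate = risk_trigger_rate(toy)

# Average rates reported in the summary: 28.3% baseline, 52.6% with evasions.
gain_pp, ratio = compare(0.283, 0.526)
print(f"gain: {gain_pp:.1f} percentage points, ratio: {ratio:.2f}x")
```

With the reported averages, the gain is about 24.3 percentage points and the ratio is about 1.86, which is the basis for describing the rate as nearly doubled.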

Key Contribution

LLM-powered autonomous agents are highly susceptible to multi-turn, context-aware attacks that bypass standard security measures: the proposed evasions nearly double the average risk trigger rate, from 28.3% to 52.6%.

Abstract

As autonomous agents (e.g., OpenClaw) increasingly operate with deep system-level privileges to execute complex tasks, they introduce severe, unmitigated security risks. Current vulnerability analyses overwhelmingly focus on single-turn, stateless behaviors, overlooking the expanded attack surface inherent in stateful, multi-turn interactions and dynamic tool invocations. In this paper, we propose a novel, multi-dimensional evasion framework targeting LLM-based agent systems. We introduce three stealthy attack vectors: (1) Temporal evasion, which fragments malicious payloads across sequential interaction turns; (2) Spatial evasion, which conceals payloads within complex external artifacts that evade standard LLM parsing mechanisms; and (3) Semantic evasion, which obscures malicious intent beneath benign contextual noise. To systematically quantify these threats, we construct A3S-Bench, a comprehensive benchmark comprising 2,254 real-world agent execution trajectories. Evaluating a standard agent framework integrated in turn with each of 10 mainstream LLM backbones against 20 practical threat scenarios, we demonstrate that our evasion framework elevates the average risk trigger rate from a 28.3% baseline to 52.6%. These findings reveal systemic, architecture-level vulnerabilities in current autonomous agent systems that existing defenses fail to address, highlighting an urgent need for defense mechanisms tailored to these threats.
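The gap between stateless and stateful analysis that the abstract points to can be illustrated with a deliberately simplified toy. The sketch below is not the paper's attack or any real defense: a harmless placeholder string stands in for flagged content, and plain substring matching stands in for whatever check a real system would apply. All names are hypothetical.

```python
# Harmless placeholder standing in for content a filter would flag.
BLOCKED_MARKER = "FORBIDDEN_ACTION"


def stateless_check(turns):
    """Inspect each turn in isolation, as a single-turn analysis would."""
    return any(BLOCKED_MARKER in turn for turn in turns)


def stateful_check(turns):
    """Inspect the accumulated history after every turn."""
    history = ""
    for turn in turns:
        history += turn
        if BLOCKED_MARKER in history:
            return True
    return False


# The same marker, delivered whole and split across three turns.
single_turn = ["FORBIDDEN_ACTION"]
fragmented = ["FORBID", "DEN_AC", "TION"]

print(stateless_check(single_turn))  # True
print(stateless_check(fragmented))   # False
print(stateful_check(fragmented))    # True
```

The per-turn check sees only fragments that look innocuous on their own, while the check over accumulated history recovers the full marker. Real agent contexts are far richer than string concatenation, so this only conveys why state matters, not how a practical defense would be built.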

Eval Frameworks & Benchmarks | Red-Teaming & Adversarial Robustness | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
