KTH | NTU | Feb 24, 2026 | arXiv:2602.21127

"Are You Sure?": An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems

Xinfeng Li, Shenyu Dai, Kelong Zheng, Yue Xiao, Gelei Deng, Wei Dong, Xiaofeng Wang

AI Summary

This paper introduces Agent-Mediated Deception (AMD) as a novel attack surface in LLM-driven agentic systems and empirically investigates human susceptibility to such attacks. In a large-scale study with 303 participants, run on the newly developed HAT-Lab platform across nine diverse scenarios, the authors find that only 8.6% of participants perceive AMD attacks, and they identify six cognitive failure modes that contribute to user vulnerability. The study further suggests that warnings which interrupt the workflow while keeping verification costs low, together with experiential learning, can improve user awareness and caution against AMD.

Key Contribution

Humans are surprisingly vulnerable to deception by compromised LLM agents: fewer than 10% of participants detected the attacks, even in high-stakes scenarios.

Abstract

Large language model (LLM) agents are rapidly becoming trusted copilots in high-stakes domains like software development and healthcare. However, this deepening trust introduces a novel attack surface: Agent-Mediated Deception (AMD), where compromised agents are weaponized against their human users. While extensive research focuses on agent-centric threats, human susceptibility to deception by a compromised agent remains unexplored. We present the first large-scale empirical study with 303 participants to measure human susceptibility to AMD. This is based on HAT-Lab (Human-Agent Trust Laboratory), a high-fidelity research platform we develop, featuring nine carefully crafted scenarios spanning everyday and professional domains (e.g., healthcare, software development, human resources). Our 10 key findings reveal significant vulnerabilities and provide future defense perspectives. Specifically, only 8.6% of participants perceive AMD attacks, while domain experts show increased susceptibility in certain scenarios. We identify six cognitive failure modes in users and find that their risk awareness often fails to translate to protective behavior. The defense analysis reveals that effective warnings should interrupt workflows with low verification costs. With experiential learning based on HAT-Lab, over 90% of users who perceive risks report increased caution against AMD. This work provides empirical evidence and a platform for human-centric agent security research.

Constitutional AI & AI Ethics | Red-Teaming & Adversarial Robustness | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 106

Year: 2026

Venue: N/A
