Amazon Science | Mar 18, 2026 | arXiv:2603.17239

LAAF: Logic-layer Automated Attack Framework: A Systematic Red-Teaming Methodology for LPCI Vulnerabilities in Agentic Large Language Model Systems

Hammad Atta, Ken Huang, Kyriakos Rock Lambros, Yasir Mehmood, Zeeshan Baig, Mohamed Abdur Rahman, Manish Bhatt, M. Aziz Ul Haq, Muhammad Aatif, Nadeem Shahzad, Kamal Noor, Vineeth Sai Narajala, Hazem Ali, Jamel Abed

AI Summary

The paper introduces LAAF, a novel automated red-teaming framework designed to identify Logic-layer Prompt Control Injection (LPCI) vulnerabilities in agentic LLM systems. LAAF combines a comprehensive taxonomy of 49 LPCI techniques with a Persistent Stage Breaker (PSB) that iteratively mutates successful payloads to escalate attacks across multiple stages. Experiments on five production LLM platforms demonstrate LAAF's superior stage-breakthrough efficiency compared to random testing, achieving an 84% mean aggregate breakthrough rate.

Key Contribution

Agentic LLMs are highly vulnerable to LPCI: by escalating prompt injection techniques across multiple stages, the framework reaches a mean aggregate breakthrough rate of 84% on five production platforms.

Abstract

Agentic LLM systems equipped with persistent memory, RAG pipelines, and external tool connectors face a class of attacks, Logic-layer Prompt Control Injection (LPCI), for which no automated red-teaming instrument existed. We present LAAF (Logic-layer Automated Attack Framework), the first automated red-teaming framework to combine an LPCI-specific technique taxonomy with stage-sequential seed escalation, two capabilities absent from existing tools: Garak lacks memory-persistence and cross-session triggering; PyRIT supports multi-turn testing but treats turns independently, without seeding each stage from the prior breakthrough. LAAF provides: (i) a 49-technique taxonomy spanning six attack categories (Encoding 11, Structural 8, Semantic 8, Layered 5, Trigger 12, Exfiltration 5; see Table 1), combinable across 5 variants per technique and 6 lifecycle stages, yielding a theoretical maximum of 2,822,400 unique payloads ($49 \times 5 \times 1{,}920 \times 6$; SHA-256 deduplicated at generation time); and (ii) a Persistent Stage Breaker (PSB) that drives payload mutation stage-by-stage: on each breakthrough, the PSB seeds the next stage with a mutated form of the winning payload, mirroring real adversarial escalation. Evaluation on five production LLM platforms across three independent runs demonstrates that LAAF achieves higher stage-breakthrough efficiency than single-technique random testing, with a mean aggregate breakthrough rate of 84% (range 83–86%) and platform-level rates stable within 17 percentage points across runs. Layered combinations and semantic reframing are the highest-effectiveness technique categories, with layered payloads outperforming encoding on well-defended platforms.
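The payload-space arithmetic, the SHA-256 deduplication, and the stage-sequential seeding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every name below (`theoretical_max`, `dedupe_sha256`, `escalate`, `mutate`, `breaks_stage`) is hypothetical, the payloads are inert placeholder strings, and the factor of 1,920 is copied from the abstract's formula without knowing what it counts.

```python
import hashlib
from typing import Callable, Iterable, Iterator, List

# Factors quoted in the abstract. The abstract does not say what 1,920
# represents, so it is kept as an opaque constant here (assumption).
TECHNIQUES = 49
VARIANTS = 5
UNSPECIFIED_FACTOR = 1_920
STAGES = 6


def theoretical_max() -> int:
    """Product of the four factors given in the abstract."""
    return TECHNIQUES * VARIANTS * UNSPECIFIED_FACTOR * STAGES


def dedupe_sha256(payloads: Iterable[str]) -> Iterator[str]:
    """Yield each payload once, using its SHA-256 digest as the identity key."""
    seen = set()
    for payload in payloads:
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield payload


def escalate(
    seed: str,
    stages: int,
    mutate: Callable[[str, int, int], str],
    breaks_stage: Callable[[int, str], bool],
    max_attempts: int = 10,
) -> List[str]:
    """Stage-sequential seeding: each stage starts from a mutated form of
    the payload that broke the previous stage. Stops at the first stage
    that is not broken within max_attempts."""
    winners: List[str] = []
    current = seed
    for stage in range(stages):
        candidate = current
        winner = None
        for attempt in range(max_attempts):
            if breaks_stage(stage, candidate):
                winner = candidate
                break
            candidate = mutate(candidate, stage, attempt)
        if winner is None:
            break
        winners.append(winner)
        # Seed the next stage from the winning payload, not from scratch.
        current = mutate(winner, stage + 1, 0)
    return winners
```

The contrast with independent multi-turn testing lies in the last line of the outer loop: the next stage inherits a mutation of the winner rather than a fresh seed. In practice `breaks_stage` would wrap a call to the system under test and `mutate` would draw on the technique taxonomy; both are left as caller-supplied stubs here.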

Recommendation & Information Retrieval | Red-Teaming & Adversarial Robustness | Tool Use & Agents

Citation Metrics

Citations: 0
Influential citations: 0
References: 21
Year: 2026
Venue: N/A
