Fujitsu Research of Europe · Feb 24, 2026 · arXiv:2602.21447

Adversarial Intent is a Latent Variable: Stateful Trust Inference for Securing Multimodal Agentic RAG

Inderjeet Singh, Vikas Pahuja, Aishvariya Priya Rathina Sabapathy, Chiara Picardi, Amit Giloni, Roman Vainshtein, Andrés Murillo, Hisashi Kojima, Motoyoshi Sekiya, Yuki Unno, Junichi Suga

AI Summary

The paper addresses the vulnerability of multimodal agentic RAG systems to adversarial attacks that distribute malicious intent across the retrieval, planning, and generation stages. The authors model the problem as a POMDP and introduce MMA-RAG^T, an inference-time control framework governed by a Modular Trust Agent (MTA) that maintains a belief state to infer adversarial intent from noisy observations. Experiments on 43,774 instances show that MMA-RAG^T achieves a 6.50x average reduction in attack success rate relative to undefended baselines, with negligible utility cost, highlighting the importance of stateful defense mechanisms.

Key Contribution

Stateful defenses are not just better, but *necessary* to protect multimodal RAG agents from sophisticated attacks that bypass stateless filters.
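
To make the stateful versus stateless distinction concrete, here is a minimal sketch of a two-state Bayesian filter over a latent "adversarial" intent. It is a textbook belief update, not the paper's MTA (which approximates the belief state via structured LLM reasoning). The prior, the threshold, the per-stage likelihoods, and all function names are hypothetical values chosen for illustration, and the intent is assumed static across stages.

```python
# Illustrative only: a two-state Bayesian filter over a latent "adversarial" intent.
# All numbers below are hypothetical and are not taken from the paper.

PRIOR_ADV = 0.1   # assumed prior probability that the session is adversarial
THRESHOLD = 0.4   # assumed belief level at which a checkpoint intervenes

# Assumed (P(obs | adversarial), P(obs | benign)) at each stage:
# every observation is only mildly suspicious (likelihood ratio of 2).
OBSERVATIONS = {
    "retrieval": (0.6, 0.3),
    "planning": (0.6, 0.3),
    "generation": (0.6, 0.3),
}


def bayes_update(prior_adv, lik_adv, lik_benign):
    """Return P(adversarial | observation) for a static two-state intent."""
    numerator = prior_adv * lik_adv
    return numerator / (numerator + (1.0 - prior_adv) * lik_benign)


def stateless_flags(observations):
    """Judge each stage in isolation, always restarting from the prior."""
    return [
        bayes_update(PRIOR_ADV, lik_adv, lik_benign) > THRESHOLD
        for lik_adv, lik_benign in observations.values()
    ]


def stateful_flags(observations):
    """Carry the posterior forward so weak evidence accumulates across stages."""
    belief, flags = PRIOR_ADV, []
    for lik_adv, lik_benign in observations.values():
        belief = bayes_update(belief, lik_adv, lik_benign)
        flags.append(belief > THRESHOLD)
    return flags, belief


if __name__ == "__main__":
    print("stateless:", stateless_flags(OBSERVATIONS))
    flags, final_belief = stateful_flags(OBSERVATIONS)
    print("stateful: ", flags, round(final_belief, 4))
```

With these assumed numbers, each stage judged alone yields a posterior of 2/11 (about 0.18) and never crosses the threshold, while the accumulated belief reaches 8/17 (about 0.47) at the generation stage and triggers an intervention.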

Abstract

Current stateless defences for multimodal agentic RAG fail to detect adversarial strategies that distribute malicious semantics across retrieval, planning, and generation components. We formulate this security challenge as a Partially Observable Markov Decision Process (POMDP), where adversarial intent is a latent variable inferred from noisy multi-stage observations. We introduce MMA-RAG^T, an inference-time control framework governed by a Modular Trust Agent (MTA) that maintains an approximate belief state via structured LLM reasoning. Operating as a model-agnostic overlay, MMA-RAG^T mediates a configurable set of internal checkpoints to enforce stateful defence-in-depth. Extensive evaluation on 43,774 instances demonstrates a 6.50x average reduction factor in Attack Success Rate relative to undefended baselines, with negligible utility cost. Crucially, a factorial ablation validates our theoretical bounds: while statefulness and spatial coverage are individually necessary (26.4 pp and 13.6 pp gains respectively), stateless multi-point intervention can yield zero marginal benefit under homogeneous stateless filtering when checkpoint detections are perfectly correlated.
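
The closing claim about perfectly correlated checkpoint detections can be illustrated with elementary probability. The sketch below is not the paper's theoretical bound. It assumes a hypothetical per-checkpoint detection probability `p_detect` and compares two idealised cases: `k` identical stateless filters whose decisions always coincide, and `k` filters whose decisions are independent (included only for contrast). The function name and parameters are illustrative.

```python
def miss_probability(p_detect, k, perfectly_correlated):
    """Probability that an attack evades all k checkpoints.

    p_detect: assumed detection probability of a single checkpoint.
    k: number of checkpoints the attack passes through.
    perfectly_correlated: if True, every checkpoint makes the same decision,
        so extra checkpoints never catch what the first one missed.
    """
    if not 0.0 <= p_detect <= 1.0:
        raise ValueError("p_detect must lie in [0, 1]")
    if k < 1:
        raise ValueError("k must be at least 1")
    if perfectly_correlated:
        # All checkpoints fail together: the miss rate does not depend on k.
        return 1.0 - p_detect
    # Independent checkpoints: the attack must evade each one separately.
    return (1.0 - p_detect) ** k


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k,
              miss_probability(0.5, k, perfectly_correlated=True),
              miss_probability(0.5, k, perfectly_correlated=False))
```

Under perfect correlation the miss probability stays at `1 - p_detect` for every `k`, which is the sense in which adding more copies of the same stateless filter brings no marginal benefit. In the independent case it shrinks geometrically with `k`.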

Multimodal Models · Red-Teaming & Adversarial Robustness · Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
