ZJUMay 26, 2026arXiv:2605.26754

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

Zhe Yu, Zhengtao Yu, Wenpeng Xing, Gaolei Li, Shuguang Xiong, Hongzhi Wang, Xuyang Teng, Meng Han

AI Summary

The paper demonstrates that simply detecting poisoned documents in RAG systems is insufficient to prevent adversarial manipulation of generated outputs due to a "monitoring-control gap". To address this, they introduce CORDON-MAS, a compartmentalized RAG architecture that enforces information-flow control by separating evidence extraction, audit, and synthesis into distinct agents with restricted memory access. Experiments across five BEIR datasets show that CORDON-MAS significantly reduces attack success rates by 92.4% compared to standard RAG.

Key Contribution

Even when RAG models detect poisoned information, they still act on it, but a new architecture can close this "monitoring-control gap" and slash attack success by 92%.

Abstract

Retrieval-augmented generation (RAG) increasingly underpins high-stakes applications, yet remains vulnerable to Confundo-style poisoning where adversarially optimized documents manipulate generated outputs. Existing defenses assume that detecting poisoned evidence prevents harm. We show this assumption is incorrect: models exhibit a monitoring-control gap -- they can detect contradictions in retrieved evidence yet still act on poisoned claims. We introduce the Cordon Principle -- no agent capable of final synthesis may access untrusted natural-language evidence -- and realize it through CORDON-MAS, a compartmentalized framework that enforces this principle architecturally by separating evidence extraction, cross-source audit, and answer synthesis into agents with asymmetric memory privileges. Across five BEIR datasets, CORDON-MAS reduces attack success rate by 92.4\% relative to undefended RAG. This reframes RAG poisoning from a detection problem to an information-flow control problem.

Constitutional AI & AI Ethics Recommendation & Information Retrieval Red-Teaming & Adversarial Robustness

Citation Metrics

Citations0

Influential citations0

References19

Year2026

VenueN/A

Related Papers

Finding related papers...

Search

Cordon-MAS: Defending RAG against Knowledge Poisoning via Information-Flow Control

Related Papers