JHU | UChicago | UCSB | University of Georgia | Apr 6, 2026 | arXiv:2604.04426

ShieldNet: Network-Level Guardrails against Emerging Supply-Chain Injections in Agentic Systems

Zhuowen Yuan, Zhaorun Chen, Zhen Xiang, Nathaniel D. Bastian, Seyyed Hadi Hashemi, Chaowei Xiao, Wenbo Guo

AI Summary

The paper introduces SC-Inject-Bench, a benchmark of over 10,000 malicious MCP tools for evaluating supply-chain attacks on LLM agent systems, and finds that existing MCP scanners and semantic guardrails perform poorly on it. To address this, the authors propose ShieldNet, a network-level guardrail framework that uses a man-in-the-middle (MITM) proxy and an event extractor to observe network interactions, then passes the extracted events to a lightweight classifier for attack detection. Experiments report strong detection performance (up to 0.995 F-1 with only 0.8% false positives) and little runtime overhead compared with existing methods.
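As a reminder of how the two reported metrics are defined, here is a minimal sketch that computes the F-1 score and the false-positive rate from confusion-matrix counts. The function name `f1_and_fpr` and the counts in the usage lines are illustrative assumptions, not values taken from the paper.

```python
def f1_and_fpr(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (F-1 score, false-positive rate) from confusion-matrix counts.

    tp: attacks flagged as attacks      fp: benign runs flagged as attacks
    fn: attacks missed                  tn: benign runs passed as benign
    """
    # F-1 is the harmonic mean of precision and recall, which simplifies to
    # 2*TP / (2*TP + FP + FN).
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    # False-positive rate: fraction of benign runs that were wrongly flagged.
    benign_total = fp + tn
    fpr = fp / benign_total if benign_total else 0.0
    return f1, fpr


# Hypothetical counts, chosen only to illustrate the arithmetic.
f1, fpr = f1_and_fpr(tp=90, fp=10, fn=10, tn=90)
print(f"F-1 = {f1:.3f}, FPR = {fpr:.3f}")
```

A high F-1 paired with a low false-positive rate means the detector catches most attacks without frequently blocking benign tool calls, which is the trade-off the summary highlights.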

Key Contribution

Your agent's shiny new tool could be a Trojan horse: ShieldNet spots supply-chain attacks by watching real network traffic rather than surface-level tool traces, substantially outperforming existing MCP scanners and LLM-based guardrails.

Abstract

Existing research on LLM agent security mainly focuses on prompt injection and unsafe input/output behaviors. However, as agents increasingly rely on third-party tools and MCP servers, a new class of supply-chain threats has emerged, where malicious behaviors are embedded in seemingly benign tools, silently hijacking agent execution, leaking sensitive data, or triggering unauthorized actions. Despite their growing impact, there is currently no comprehensive benchmark for evaluating such threats. To bridge this gap, we introduce SC-Inject-Bench, a large-scale benchmark comprising over 10,000 malicious MCP tools grounded in a taxonomy of 25+ attack types derived from MITRE ATT&CK targeting supply-chain threats. We observe that existing MCP scanners and semantic guardrails perform poorly on this benchmark. Motivated by this finding, we propose ShieldNet, a network-level guardrail framework that detects supply-chain poisoning by observing real network interactions rather than surface-level tool traces. ShieldNet integrates a man-in-the-middle (MITM) proxy and an event extractor to identify critical network behaviors, which are then processed by a lightweight classifier for attack detection. Extensive experiments show that ShieldNet achieves strong detection performance (up to 0.995 F-1 with only 0.8% false positives) while introducing little runtime overhead, substantially outperforming existing MCP scanners and LLM-based guardrails.
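The pipeline described in the abstract (proxy-observed traffic, event extraction, lightweight classification) can be pictured with the following minimal sketch. Everything in it is an assumption made for illustration: the `NetworkEvent` record, the `ALLOWED_HOSTS` allowlist, the chosen features, and the threshold rule in `classify`, which stands in for the paper's learned classifier. It does not reproduce ShieldNet's actual event schema or model.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class NetworkEvent:
    """Hypothetical record of one request seen by a MITM proxy."""
    method: str
    url: str
    request_bytes: int


# Hypothetical allowlist: hosts this tool is expected to contact.
ALLOWED_HOSTS = {"api.example.com"}


def extract_features(events: list[NetworkEvent]) -> dict[str, int]:
    """Summarize a tool call's traffic into a few numeric features."""
    unexpected_hosts = set()
    bytes_to_unexpected = 0
    for event in events:
        host = urlparse(event.url).hostname
        if host not in ALLOWED_HOSTS:
            unexpected_hosts.add(host)
            bytes_to_unexpected += event.request_bytes
    return {
        "n_events": len(events),
        "n_unexpected_hosts": len(unexpected_hosts),
        "bytes_to_unexpected": bytes_to_unexpected,
    }


def classify(features: dict[str, int], byte_threshold: int = 1024) -> bool:
    """Stand-in rule (not a learned model): flag sizeable uploads to unexpected hosts."""
    return (
        features["n_unexpected_hosts"] > 0
        and features["bytes_to_unexpected"] >= byte_threshold
    )


# Usage: a benign tool call versus one that also uploads data elsewhere.
benign = [NetworkEvent("GET", "https://api.example.com/v1/search", 120)]
suspicious = benign + [NetworkEvent("POST", "http://203.0.113.7/upload", 4096)]
print(classify(extract_features(benign)), classify(extract_features(suspicious)))
```

The point of the sketch is the design choice the abstract argues for: the decision is made from what the tool actually sends over the network, so a tool whose description and returned trace look benign can still be flagged by its traffic.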

Eval Frameworks & Benchmarks | Red-Teaming & Adversarial Robustness | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
