UBC | Mar 9, 2026 | arXiv:2603.08993

Arbiter: Detecting Interference in LLM Agent System Prompts

AI Summary

Arbiter is a framework that combines formal evaluation rules with multi-model LLM scouring to identify interference patterns in LLM agent system prompts, addressing the lack of testing infrastructure for these software artifacts. Applied to the system prompts of Claude Code, Codex CLI, and Gemini CLI, it produced 152 findings in undirected scouring, plus 21 hand-labeled interference patterns in a directed analysis of one vendor. The study finds that prompt architecture correlates with failure class but not with severity, and that multi-model evaluation uncovers vulnerability classes categorically different from those found by single-model analysis. One finding, structural data loss in Gemini CLI's memory system, was consistent with an issue filed and patched by Google, where the patch addressed the symptom but not the schema-level root cause.

Key Contribution

For a total cost of $0.27, the framework surfaces interference patterns in the system prompts of three leading coding agents (Claude Code, Codex CLI, and Gemini CLI) and shows that multi-model LLM scouring finds vulnerability classes that single-model analysis misses.

Abstract

System prompts for LLM-based coding agents are software artifacts that govern agent behavior, yet they lack the testing infrastructure applied to conventional software. We present Arbiter, a framework combining formal evaluation rules with multi-model LLM scouring to detect interference patterns in system prompts. Applying it to three major coding agent system prompts, those of Claude Code (Anthropic), Codex CLI (OpenAI), and Gemini CLI (Google), we identify 152 findings across the undirected scouring phase and 21 hand-labeled interference patterns in directed analysis of one vendor. We show that prompt architecture (monolithic, flat, modular) strongly correlates with observed failure class but not with severity, and that multi-model evaluation discovers categorically different vulnerability classes than single-model analysis. One scourer finding, structural data loss in Gemini CLI's memory system, was consistent with an issue filed and patched by Google; the patch addressed the symptom without addressing the schema-level root cause identified by the scourer. Total cost of cross-vendor analysis: $0.27 USD.

Code Generation & Program Synthesis | Red-Teaming & Adversarial Robustness | Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 14

Year: 2026

Venue: N/A
