Feb 17, 2026 · arXiv:2602.15945

From Tool Orchestration to Code Execution: A Study of MCP Design Choices

Yuval Felendler, Parth A. Gandhi, Idan Habler, Yuval Elovici, Asaf Shabtai

AI Summary

This paper formalizes the distinction between context-coupled (traditional) and context-decoupled (CE-MCP) architectures for Model Context Protocols, which agents use for tool orchestration, and analyzes their scalability trade-offs. Using the MCP-Bench framework across 10 MCP servers, it empirically evaluates both architectures on task behavior, tool utilization, latency, and protocol efficiency. The study finds that CE-MCP reduces token usage and latency but introduces a larger attack surface. The authors address this by applying the MAESTRO framework to identify and validate vulnerabilities, and they propose a layered defense architecture.

Key Contribution

Code Execution MCPs slash token usage and latency in agent systems, but open a Pandora's Box of new attack vectors, demanding layered defenses.

Abstract

Model Context Protocols (MCPs) provide a unified platform for agent systems to discover, select, and orchestrate tools across heterogeneous execution environments. As MCP-based systems scale to incorporate larger tool catalogs and multiple concurrently connected MCP servers, traditional tool-by-tool invocation increases coordination overhead, fragments state management, and limits support for wide-context operations. To address these scalability challenges, recent MCP designs have incorporated code execution as a first-class capability, an approach called Code Execution MCP (CE-MCP). This enables agents to consolidate complex workflows, such as SQL querying, file analysis, and multi-step data transformations, into a single program that executes within an isolated runtime environment. In this work, we formalize the architectural distinction between context-coupled (traditional) and context-decoupled (CE-MCP) models, analyzing their fundamental scalability trade-offs. Using the MCP-Bench framework across 10 representative servers, we empirically evaluate task behavior, tool utilization patterns, execution latency, and protocol efficiency as the scale of connected MCP servers and available tools increases, demonstrating that while CE-MCP significantly reduces token usage and execution latency, it introduces a vastly expanded attack surface. We address this security gap by applying the MAESTRO framework, identifying sixteen attack classes across five execution phases, including specific code execution threats such as exception-mediated code injection and unsafe capability synthesis. We validate these vulnerabilities through adversarial scenarios across multiple LLMs and propose a layered defense architecture comprising containerized sandboxing and semantic gating. Our findings provide a rigorous roadmap for balancing scalability and security in production-ready executable agent workflows.
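To make the context-coupled versus context-decoupled distinction concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation or benchmark: the three tool functions are hypothetical stand-ins for MCP tools, character counts serve as a crude proxy for token usage, and the in-process `exec` call is not an isolated runtime (the abstract's proposed defense relies on containerized sandboxing instead).

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in tools; real MCP servers would expose these over the protocol.
def query_rows(n: int) -> List[Dict[str, int]]:
    return [{"id": i, "amount": i * 10} for i in range(n)]

def filter_rows(rows: List[Dict[str, int]], min_amount: int) -> List[Dict[str, int]]:
    return [r for r in rows if r["amount"] >= min_amount]

def total_amount(rows: List[Dict[str, int]]) -> int:
    return sum(r["amount"] for r in rows)


def context_coupled(n: int, min_amount: int) -> Tuple[int, int]:
    """Tool-by-tool invocation: every intermediate result re-enters the model context."""
    context_chars = 0
    rows = query_rows(n)
    context_chars += len(repr(rows))    # full result set passes through the context
    kept = filter_rows(rows, min_amount)
    context_chars += len(repr(kept))    # filtered rows pass through the context again
    total = total_amount(kept)
    context_chars += len(repr(total))
    return total, context_chars


def context_decoupled(n: int, min_amount: int) -> Tuple[int, int]:
    """CE-MCP style: one program runs the workflow; only the final value returns."""
    program = (
        "rows = query_rows(n)\n"
        "kept = filter_rows(rows, min_amount)\n"
        "result = total_amount(kept)\n"
    )
    namespace = {
        "query_rows": query_rows,
        "filter_rows": filter_rows,
        "total_amount": total_amount,
        "n": n,
        "min_amount": min_amount,
    }
    # ASSUMPTION: exec() stands in for an isolated runtime. It provides no
    # sandboxing and must not be used on untrusted, model-generated code.
    exec(program, namespace)
    result = namespace["result"]
    # Only the program text and the final value touch the model context.
    context_chars = len(program) + len(repr(result))
    return result, context_chars


if __name__ == "__main__":
    print("coupled:  ", context_coupled(100, 500))
    print("decoupled:", context_decoupled(100, 500))
```

Both functions compute the same total, but the decoupled variant keeps intermediate rows out of the context entirely. The same design is what widens the attack surface: the model now emits a program rather than a sequence of constrained tool calls, so the runtime executing it needs isolation and gating.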

Code Generation & Program Synthesis · Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
