Melbourne | Apr 30, 2026 | arXiv:2604.27891

In-Context Prompting Obsoletes Agent Orchestration for Procedural Tasks

Simon Dennis, Michael Diamond, Rivaan Patil, Kevin D. Shabahang, Kevin Shabahang, Hao Guo

AI Summary

This paper compares agent orchestration frameworks (e.g., LangGraph) to a simpler in-context prompting approach for procedural tasks. The authors found that in-context prompting, where the entire procedure is included in the system prompt, outperforms agent orchestration across three domains: travel booking, Zoom technical support, and insurance claims processing. This suggests that for current frontier models, external orchestration is no longer necessary for multi-turn conversations with defined procedures.

Key Contribution

Agent orchestration frameworks might be overkill: simply including the entire procedure in the system prompt yields better performance on procedural tasks.
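As a rough illustration of what "including the entire procedure in the system prompt" could look like, the sketch below serializes a small procedure graph into one prompt and, for contrast, shows the kind of single-step instruction an external orchestrator would inject each turn. The `Node` schema, the function names, the prompt wording, and the three-node travel procedure are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One step of a procedure (hypothetical schema, not from the paper)."""
    node_id: str
    instruction: str
    # Maps a plain-language condition to the id of the next node.
    transitions: dict[str, str] = field(default_factory=dict)


def build_system_prompt(nodes: list[Node], start: str) -> str:
    """In-context approach: serialize the whole procedure into one prompt."""
    lines = [
        "Follow the procedure below. Track the current step yourself.",
        f"Begin at step '{start}'.",
        "",
    ]
    for node in nodes:
        lines.append(f"[{node.node_id}] {node.instruction}")
        for condition, target in node.transitions.items():
            lines.append(f"  - if {condition}: go to [{target}]")
    return "\n".join(lines)


def routing_instruction(nodes: list[Node], current: str) -> str:
    """Orchestrated approach: only the current step is shown for this turn."""
    node = next(n for n in nodes if n.node_id == current)
    return f"Current step [{node.node_id}]: {node.instruction}"


# Toy three-node procedure, invented purely for illustration.
PROCEDURE = [
    Node("greet", "Greet the customer and ask for the destination.",
         {"destination given": "dates"}),
    Node("dates", "Ask for travel dates.", {"dates given": "confirm"}),
    Node("confirm", "Summarize the booking and ask for confirmation."),
]

if __name__ == "__main__":
    print(build_system_prompt(PROCEDURE, start="greet"))
```

In the in-context setup the model sees every step and transition at once and has to keep track of its own position; in the orchestrated setup that state lives in external code, which decides what the model sees on each turn.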

Abstract

Agent orchestration frameworks -- LangGraph, CrewAI, Google ADK, OpenAI Agents SDK, and others -- place an external orchestrator above the LLM, tracking state and injecting routing instructions at every turn. We present a controlled comparison showing that for procedural tasks, this architecture is dominated by a simpler alternative: putting the entire procedure in the system prompt and letting the model self-orchestrate. Across three domains -- travel booking (14 nodes), Zoom technical support (14 nodes), and insurance claims processing (55 nodes) -- we evaluate 200 conversations per condition using LLM-as-judge scoring on five quality criteria. The in-context approach scores 4.53--5.00 on a 5-point scale while a LangGraph orchestrator using the same model scores 4.17--4.84. The orchestrated system fails on 24% of travel, 9% of Zoom, and 17% of insurance conversations, compared to 11.5%, 0.5%, and 5% for the in-context baseline. While external orchestration may have been necessary for earlier models, advances in frontier model capabilities have made it unnecessary for multi-turn conversations following a defined procedure.
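To make the failure rates concrete, the short sketch below converts the reported percentages into conversation counts. It assumes that "200 conversations per condition" means 200 conversations per domain and per condition; that reading is an assumption, though it is consistent with the 0.5% figure corresponding to exactly one conversation. The variable and function names are illustrative.

```python
# Assumption: 200 conversations per domain and per condition.
CONVERSATIONS = 200

# Failure rates in percent, as reported in the abstract.
FAILURE_RATE_PCT = {
    "travel": {"orchestrated": 24.0, "in_context": 11.5},
    "zoom": {"orchestrated": 9.0, "in_context": 0.5},
    "insurance": {"orchestrated": 17.0, "in_context": 5.0},
}


def failure_count(rate_pct: float, n: int = CONVERSATIONS) -> int:
    """Convert a failure rate in percent into a number of conversations."""
    return round(rate_pct * n / 100)


if __name__ == "__main__":
    for domain, rates in FAILURE_RATE_PCT.items():
        orchestrated = failure_count(rates["orchestrated"])
        in_context = failure_count(rates["in_context"])
        print(f"{domain}: {orchestrated} vs {in_context} failed conversations")
```

Under that assumption, the orchestrated system fails on 48, 18, and 34 conversations for travel, Zoom, and insurance, against 23, 1, and 10 for the in-context baseline.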

Eval Frameworks & Benchmarks | Reasoning & Chain-of-Thought | Tool Use & Agents

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
