The paper introduces ARC (Agentic Resource & Configuration learner), a reinforcement-learning approach that configures LLM-based agent systems dynamically on a per-query basis. ARC learns a hierarchical policy over workflows, tools, token budgets, and prompts, addressing the limitations of static, hand-tuned configurations. Experiments across reasoning and tool-augmented question-answering benchmarks show that ARC achieves up to 25% higher task accuracy than strong baselines while reducing token and runtime costs.
Stop wasting compute: Reinforcement learning can dynamically configure LLM agents on a per-query basis, boosting accuracy by up to 25% while slashing token and runtime costs.
Configuring LLM-based agent systems involves choosing workflows, tools, token budgets, and prompts from a large combinatorial design space, and is typically handled today by large fixed templates or hand-tuned heuristics. This leads to brittle behavior and unnecessary compute, since the same cumbersome configuration is often applied to both easy and hard input queries. We formulate agent configuration as a query-wise decision problem and introduce ARC (Agentic Resource & Configuration learner), which learns a lightweight hierarchical policy via reinforcement learning to dynamically tailor these configurations. Across multiple benchmarks spanning reasoning and tool-augmented question answering, the learned policy consistently outperforms strong hand-designed and learned baselines, achieving up to 25% higher task accuracy while also reducing token and runtime costs. These results demonstrate that learning per-query agent configurations is a powerful alternative to "one size fits all" designs.
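To make the query-wise formulation concrete, here is a minimal sketch (not the paper's implementation) of learning a per-query configuration with a softmax policy trained by REINFORCE. The candidate configurations, the query features, and the reward are all invented for illustration; ARC's actual hierarchical action space and training setup are not shown here.

```python
import math
import random

# Hypothetical per-query configurations: an easy query should get the
# cheap "direct" workflow, a hard one the tool-augmented large budget.
CONFIGS = [
    {"workflow": "direct", "budget": 256},
    {"workflow": "plan_then_act", "budget": 1024},
    {"workflow": "plan_then_act_tools", "budget": 4096},
]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

class ConfigPolicy:
    """Linear softmax policy over configurations, updated with REINFORCE."""

    def __init__(self, n_feats, n_configs, lr=0.1):
        self.w = [[0.0] * n_feats for _ in range(n_configs)]
        self.lr = lr

    def probs(self, feats):
        logits = [sum(wi * f for wi, f in zip(row, feats)) for row in self.w]
        return softmax(logits)

    def act(self, feats, rng):
        p = self.probs(feats)
        return rng.choices(range(len(p)), weights=p)[0]

    def update(self, feats, action, reward):
        # Softmax policy gradient: d log pi(a|x) / d w_i = (1[a=i] - p_i) * x
        p = self.probs(feats)
        for i, row in enumerate(self.w):
            coeff = self.lr * reward * ((1.0 if i == action else 0.0) - p[i])
            for j, f in enumerate(feats):
                row[j] += coeff * f

# Toy training loop: one-hot query features (easy vs. hard), reward 1
# only when the chosen configuration matches the query's difficulty.
rng = random.Random(0)
policy = ConfigPolicy(n_feats=2, n_configs=len(CONFIGS))
for t in range(2000):
    hard = t % 2 == 1
    feats = [0.0, 1.0] if hard else [1.0, 0.0]
    a = policy.act(feats, rng)
    correct = (hard and a == 2) or (not hard and a == 0)
    policy.update(feats, a, 1.0 if correct else 0.0)
```

After training, the policy routes easy queries to the cheap configuration and hard queries to the expensive one, which is the "per-query, not one-size-fits-all" behavior the abstract argues for.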