Feb 23, 2026 · arXiv:2602.19672

SkillOrchestra: Learning to Route Agents via Skill Transfer

Jiayu Wang, Yifei Ming, Zixuan Ke, Shafiq Joty, Aws Albarghouthi, Frederic Sala

AI Summary

The paper introduces SkillOrchestra, a framework that orchestrates agents by learning fine-grained skills from execution experience and modeling each agent's competence and cost under those skills. At deployment, SkillOrchestra infers the skill demands of the current interaction and selects the agents that best satisfy them under a performance-cost trade-off, addressing the limitations of input-level routers and RL-trained orchestrators. Experiments across ten benchmarks show that SkillOrchestra outperforms state-of-the-art RL-based orchestrators by up to 22.5% while reducing learning cost by 700x and 300x relative to Router-R1 and ToolOrchestra, respectively.

Key Contribution

By explicitly modeling agent skills and costs, SkillOrchestra reduces the learning cost of agent orchestration by up to 700x while improving performance, offering a more scalable and interpretable alternative to RL-based methods.

Abstract

Compound AI systems promise capabilities beyond those of individual models, yet their success depends critically on effective orchestration. Existing routing approaches face two limitations: (1) input-level routers make coarse query-level decisions that ignore evolving task requirements; (2) RL-trained orchestrators are expensive to adapt and often suffer from routing collapse, repeatedly invoking one strong but costly option in multi-turn scenarios. We introduce SkillOrchestra, a framework for skill-aware orchestration. Instead of directly learning a routing policy end-to-end, SkillOrchestra learns fine-grained skills from execution experience and models agent-specific competence and cost under those skills. At deployment, the orchestrator infers the skill demands of the current interaction and selects agents that best satisfy them under an explicit performance-cost trade-off. Extensive experiments across ten benchmarks demonstrate that SkillOrchestra outperforms SoTA RL-based orchestrators by up to 22.5% with 700x and 300x learning cost reduction compared to Router-R1 and ToolOrchestra, respectively. These results show that explicit skill modeling enables scalable, interpretable, and sample-efficient orchestration, offering a principled alternative to data-intensive RL-based approaches. The code is available at: https://github.com/jiayuww/SkillOrchestra.
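The selection step described in the abstract, choosing the agent that best satisfies the inferred skill demands under a performance-cost trade-off, can be illustrated with a minimal sketch. This is not the paper's implementation: the `AgentProfile` class, the linear scoring rule, the `cost_weight` parameter, and all numbers below are hypothetical assumptions chosen only to make the idea concrete. The sketch also assumes the skill demands have already been inferred and are given as weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentProfile:
    """Hypothetical per-agent profile; not taken from the paper's code."""
    name: str
    competence: dict[str, float]  # skill -> estimated success rate in [0, 1]
    cost: float                   # cost per call, in arbitrary units


def expected_competence(agent: AgentProfile, demands: dict[str, float]) -> float:
    """Demand-weighted average of the agent's competence over the required skills."""
    total = sum(demands.values())
    if total == 0:
        return 0.0
    weighted = sum(w * agent.competence.get(skill, 0.0) for skill, w in demands.items())
    return weighted / total


def select_agent(agents: list[AgentProfile], demands: dict[str, float],
                 cost_weight: float) -> AgentProfile:
    """Pick the agent maximizing competence minus a cost penalty (assumed linear)."""
    return max(agents, key=lambda a: expected_competence(a, demands) - cost_weight * a.cost)


# Illustrative agents with made-up numbers.
agents = [
    AgentProfile("large-model", {"math": 0.9, "search": 0.8}, cost=1.0),
    AgentProfile("small-model", {"math": 0.5, "search": 0.75}, cost=0.1),
]

math_turn = select_agent(agents, {"math": 1.0}, cost_weight=0.3)
search_turn = select_agent(agents, {"search": 1.0}, cost_weight=0.3)
print(math_turn.name, search_turn.name)
```

In this toy setup, a turn that demands a skill where the expensive agent is much stronger is routed to it, while a turn where the cheaper agent is nearly as competent is routed to the cheaper one. Because the choice is re-made per turn from the current skill demands, the same strong but costly agent is not invoked by default on every turn.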

RLHF & Preference Learning · Tool Use & Agents

Citation Metrics

Citations: 0

Influential citations: 0

References: 53

Year: 2026

Venue: N/A
