The paper introduces Team-of-Thoughts, a multi-agent system (MAS) architecture that dynamically activates heterogeneous agents with specialized tool-calling capabilities, addressing the limitations of static, homogeneous MAS configurations. It optimizes performance through two mechanisms: orchestrator calibration, which identifies the models with the strongest coordination capabilities, and a self-assessment protocol in which tool agents profile their own domain expertise. Across five reasoning and code generation benchmarks, Team-of-Thoughts outperforms homogeneous role-play baselines, reaching 96.67% accuracy on AIME24 (baseline: 80%) and 72.53% on LiveCodeBench (baseline: 65.93%).
Forget static, homogeneous multi-agent systems: Team-of-Thoughts unlocks superior performance by dynamically orchestrating heterogeneous agents based on calibrated coordination and self-assessed domain expertise.
Existing Multi-Agent Systems (MAS) typically rely on static, homogeneous model configurations, limiting their ability to exploit the distinct strengths of differently post-trained models. To address this, we introduce Team-of-Thoughts, a novel MAS architecture that leverages the complementary capabilities of heterogeneous agents via an orchestrator-tool paradigm. Our framework introduces two key mechanisms to optimize performance: (1) an orchestrator calibration scheme that identifies models with superior coordination capabilities, and (2) a self-assessment protocol where tool agents profile their own domain expertise to account for variations in post-training skills. During inference, the orchestrator dynamically activates the most suitable tool agents based on these proficiency profiles. Experiments on five reasoning and code generation benchmarks show that Team-of-Thoughts delivers consistently superior task performance. Notably, on AIME24 and LiveCodeBench, our approach achieves accuracies of 96.67% and 72.53%, respectively, substantially outperforming homogeneous role-play baselines, which score 80% and 65.93%.
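The inference-time routing described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `ToolAgent` class, the `[0, 1]` proficiency scale, the `top_k` and `min_score` parameters, and the idea of summarizing calibration as a single score per model are all assumptions made for illustration. The abstract states only that an orchestrator is chosen by calibration and that tool agents are activated according to self-assessed proficiency profiles.

```python
from dataclasses import dataclass, field


@dataclass
class ToolAgent:
    """A tool agent with a self-assessed proficiency profile (hypothetical structure)."""
    name: str
    # Maps a domain label to a self-assessed score in [0, 1] (assumed scale).
    profile: dict = field(default_factory=dict)


def select_orchestrator(calibration_scores):
    """Return the model name with the highest calibration score.

    Assumes calibration has already been reduced to one number per model.
    """
    return max(calibration_scores, key=calibration_scores.get)


def activate_agents(agents, domain, top_k=2, min_score=0.5):
    """Return the names of up to top_k agents whose self-assessed score
    for `domain` is at least min_score, best first (ties broken by name)."""
    scored = [(agent.profile.get(domain, 0.0), agent.name) for agent in agents]
    eligible = [(score, name) for score, name in scored if score >= min_score]
    eligible.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in eligible[:top_k]]


# Illustrative data only; these are not results from the paper.
agents = [
    ToolAgent("model_a", {"math": 0.9, "code": 0.4}),
    ToolAgent("model_b", {"math": 0.6, "code": 0.8}),
    ToolAgent("model_c", {"math": 0.3, "code": 0.7}),
]
orchestrator = select_orchestrator({"model_a": 0.70, "model_b": 0.85, "model_c": 0.60})
math_team = activate_agents(agents, "math")
code_team = activate_agents(agents, "code")
```

In this toy setup a math query would activate `model_a` and `model_b`, while a code query would activate `model_b` and `model_c`. The threshold keeps an agent out of domains where it rates itself poorly, which is one plausible way to act on heterogeneous post-training skills.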