The paper introduces FunReason-MT, a novel data synthesis framework designed to generate high-quality, multi-turn training data for function calling in large language models, addressing limitations in existing methods like random sampling and multi-agent role-playing. FunReason-MT employs Environment-API Graph Interactions, Advanced Tool-Query Synthesis, and Guided Iterative Chain to overcome challenges in targeted data synthesis, hard query construction, and multi-turn logical dependency. Experiments on BFCLv3 and BFCLv4 show that models trained on FunReason-MT data achieve state-of-the-art performance among comparable-sized models, demonstrating the framework's effectiveness in agentic learning.
Forget random sampling – this framework crafts targeted, multi-turn function-calling data that catapults smaller LLMs to state-of-the-art performance.
Function calling (FC) empowers large language models (LLMs) and autonomous agents to interface with external tools, a critical capability for solving complex, real-world problems. As this ability becomes increasingly central to advanced AI systems, the need for high-quality, multi-turn training data to develop and refine it cannot be overstated. Existing data synthesis methods, such as random environment sampling or multi-agent role-playing, are not powerful enough to generate high-quality data in real-world environments. The practical challenges are threefold: targeted data synthesis, hard query construction, and multi-turn logical dependency. To address these structural deficiencies, we present FunReason-MT, a novel data synthesis framework for real-world multi-turn tool use. FunReason-MT resolves the complexity barrier in multi-turn FC data by employing 1) Environment-API Graph Interactions to gather varied, high-quality trajectories with targeted tools, 2) Advanced Tool-Query Synthesis to simplify hard query construction, and 3) Guided Iterative Chain for sophisticated CoT generation. Evaluations on the Berkeley Function-Calling Leaderboard (BFCLv3) demonstrate the power of our framework: a 4B model built upon FunReason-MT-generated data achieves state-of-the-art performance among comparable-sized models. Further performance improvements on BFCLv4 confirm that FunReason-MT provides a reliable and robust source for agentic learning.
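To make "targeted data synthesis" and "multi-turn logical dependency" more concrete, here is a minimal Python sketch of one way to walk a tool-dependency graph so that a trajectory ends at a chosen tool. It is not the authors' implementation: the tool names, the `API_DEPENDENCIES` mapping, and the `trajectory_to` helper are hypothetical, and the graph is assumed to be acyclic.

```python
# Hypothetical toy API graph (an assumption for illustration only):
# each tool maps to the tools whose results it needs before it can be called.
API_DEPENDENCIES = {
    "login": [],
    "search_flights": ["login"],
    "get_balance": ["login"],
    "book_flight": ["search_flights", "get_balance"],
}


def trajectory_to(target, deps):
    """Return a call order that satisfies every prerequisite and ends with `target`.

    Depth-first post-order traversal; assumes `deps` contains no cycles.
    """
    order = []
    seen = set()

    def visit(tool):
        if tool in seen:
            return
        seen.add(tool)
        for prerequisite in deps[tool]:
            visit(prerequisite)
        # A tool is appended only after all of its prerequisites.
        order.append(tool)

    visit(target)
    return order


# Each step in the resulting list could correspond to one turn of a
# multi-turn conversation that eventually exercises the targeted tool.
trajectory = trajectory_to("book_flight", API_DEPENDENCIES)
print(trajectory)
```

Choosing the target first and deriving the prerequisite calls from it contrasts with random sampling, which may never reach a rarely used tool. The actual graph construction and sampling strategy in FunReason-MT may differ substantially from this sketch.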