The paper introduces Meta-Tool, a retrieval-augmented generation (RAG) based system that enables large language models (LLMs) to retrieve and utilize appropriate tools from a predefined library for open-world function calling. Within Meta-Tool, the authors propose a hypothesize-retrieve-invoke framework to improve tool selection. They also present Meta-Bench, a new benchmark with 2,800 dialogues and 7,361 tools across ten scenarios, and MT-LLaMA, a fine-tuned LLaMA-3.1 model, demonstrating significant gains in tool retrieval and utilization over existing methods.
LLMs can now navigate a vast landscape of 7,000+ tools with a RAG-inspired system, outperforming previous methods in open-world function calling.
Large language models (LLMs) have showcased remarkable capabilities as autonomous agents when augmented with external tools. Equipped with only fixed tool sets, however, LLMs struggle to address diverse user inquiries in open-world tasks. To evaluate and boost the performance of LLMs in dealing with complex real-world demands, we propose open-world function calling, in which LLMs must retrieve suitable tools from a predefined external tool library and use the retrieved tools to resolve the user's problem. We introduce Meta-Tool, a versatile, plug-and-play tool retrieval system that gives LLMs access to the external tool library. Drawing inspiration from the many enhanced approaches associated with Retrieval-Augmented Generation (RAG), Meta-Tool employs a hypothesize-retrieve-invoke framework. We further propose Meta-Bench, a comprehensive benchmark for evaluating LLMs on open-world function calling and associated tasks. Meta-Bench encompasses 2,800 dialogues and 7,361 tools, spanning ten distinct scenarios to provide robust and diverse test categories. In conjunction, we present MT-LLaMA, a fine-tuned version of LLaMA-3.1, which exhibits remarkable performance improvements. Our empirical experiments reveal that Meta-Tool significantly enhances the ability of advanced LLMs to retrieve and leverage the most suitable tools compared with previous tool retrieval methods. Moreover, our fine-tuning enables even smaller-sized LLMs to achieve comparable or even exceeding
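The hypothesize-retrieve-invoke flow named above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the toy bag-of-words embedding, the `TOOL_LIBRARY` entries, and the `hypothesize`/`invoke` stubs are all hypothetical stand-ins for a real encoder, tool library, and LLM.

```python
# Hedged sketch of a hypothesize-retrieve-invoke pipeline for tool retrieval.
# All components here are illustrative stand-ins, not the Meta-Tool system itself.
import math
from collections import Counter

# Hypothetical tool library: name -> natural-language description.
TOOL_LIBRARY = {
    "get_weather": "return the current weather forecast for a city",
    "convert_currency": "convert an amount between two currencies",
    "search_flights": "search flight options between two airports",
}

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would use a dense encoder.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hypothesize(query: str) -> str:
    # Stand-in for the LLM drafting a hypothetical tool description
    # from the user query before retrieval.
    return f"a tool that can {query}"

def retrieve(hypothesis: str, k: int = 1) -> list[str]:
    # Rank library tools by similarity to the hypothesized description.
    q = embed(hypothesis)
    ranked = sorted(
        TOOL_LIBRARY,
        key=lambda name: cosine(q, embed(TOOL_LIBRARY[name])),
        reverse=True,
    )
    return ranked[:k]

def invoke(tool_name: str, **kwargs) -> dict:
    # Stand-in for the actual function call executed by the agent.
    return {"tool": tool_name, "args": kwargs}

query = "convert an amount between two currencies"
top_tools = retrieve(hypothesize(query))
result = invoke(top_tools[0], amount=100, src="USD", dst="EUR")
```

The hypothesis step mirrors HyDE-style retrieval: matching a generated tool description against library descriptions tends to work better than matching the raw user query directly.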