This paper introduces HyperTool, an executable interface for tool-augmented LLM agents that consolidates multiple tool calls into a single model-visible execution unit. By addressing the execution-granularity mismatch of step-wise tool calls, HyperTool lets the model handle data flow between tools inside one call rather than in its reasoning trace, which saves context and improves performance on cross-tool compositional tasks. On the MCP-Universe benchmark, average accuracy rises from 15.69% to 35.29% for Qwen3-32B and from 9.93% to 33.33% for Qwen3-8B, surpassing GPT-OSS and Kimi-k2.5 on average accuracy.
HyperTool more than doubles average multi-step tool-use accuracy on MCP-Universe for Qwen3-32B and Qwen3-8B by folding deterministic tool workflows into single calls.
Tool-augmented LLM agents commonly rely on step-wise atomic tool calls, where each invocation, observation, and value transfer is exposed in the main reasoning trace. This creates an execution-granularity mismatch: locally deterministic tool workflows are unfolded into repeated model-visible decisions, consuming context and forcing the model to manage low-level dataflow in the trace. We introduce HyperTool, a unified executable MCP-style tool interface that changes the model-visible unit of tool execution. A model invokes HyperTool with a code block that can call existing tools through their original schemas, manipulate returned values, and pass intermediate results locally, folding deterministic tool subroutines into a single outer call. To train models to use this interface, we synthesize HyperTool-format trajectories from cross-tool compositional tasks and verify them in real MCP environments. On MCP-Universe, HyperTool improves average accuracy from 15.69% to 35.29% on Qwen3-32B and from 9.93% to 33.33% on Qwen3-8B, and surpasses GPT-OSS and Kimi-k2.5 on average accuracy, showing that HyperTool can substantially improve multi-step tool use.
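As a rough illustration of the interface described in the abstract, the sketch below folds two dependent tool calls into one outer call. It is not the authors' implementation. The tool names (`search_city`, `get_weather`), the in-process `TOOLS` registry standing in for MCP servers, the `hypertool` function, and the convention of returning a variable named `result` are all assumptions made for this example.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-ins for existing MCP tools. Names and return
# shapes are illustrative assumptions, not taken from the paper.
def search_city(name: str) -> Dict[str, Any]:
    return {"name": name, "lat": 48.85, "lon": 2.35}

def get_weather(lat: float, lon: float) -> Dict[str, Any]:
    return {"temp_c": 18.0, "lat": lat, "lon": lon}

TOOLS: Dict[str, Callable[..., Any]] = {
    "search_city": search_city,
    "get_weather": get_weather,
}

def hypertool(code: str, tools: Dict[str, Callable[..., Any]] = TOOLS) -> Dict[str, Any]:
    """Run a model-written code block with the tools in scope.

    Only the final `result` value (plus a log of inner calls) goes back
    to the model; intermediate values stay local to the code block.
    """
    inner_calls: List[str] = []

    def wrap(name: str, fn: Callable[..., Any]) -> Callable[..., Any]:
        def inner(*args: Any, **kwargs: Any) -> Any:
            inner_calls.append(name)  # record each inner tool invocation
            return fn(*args, **kwargs)
        return inner

    namespace: Dict[str, Any] = {name: wrap(name, fn) for name, fn in tools.items()}
    # Assumption: no sandboxing here; a real system would isolate execution.
    exec(code, namespace)
    return {"result": namespace.get("result"), "inner_calls": inner_calls}

# Code the model would emit as the single outer call's argument.
MODEL_CODE = """
city = search_city("Paris")
weather = get_weather(city["lat"], city["lon"])
result = f"{city['name']}: {weather['temp_c']} C"
"""

outcome = hypertool(MODEL_CODE)
print(outcome)
```

With step-wise calling, the coordinates returned by the first tool would appear in the reasoning trace and the model would have to copy them into the second call. Here that transfer happens inside the code block, so the trace holds one invocation and one observation.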