This paper identifies a novel supply-chain attack vector in tool-using LLM agents where malicious MCP tools, co-registered with normal tools, induce "overthinking loops" characterized by cyclic tool-call trajectories. The authors formalize this as a structural overthinking attack, distinct from token-level verbosity, and demonstrate its effectiveness using 14 malicious tools across three servers. Experiments with various tool-capable models and registries reveal significant resource amplification (up to 142.4x token usage) and task outcome degradation, highlighting the vulnerability of current tool-chaining mechanisms.
Maliciously crafted tools can hijack LLM agents into "overthinking loops," inflating token usage by up to 142.4x without any single tool call looking abnormal.
Tool-using LLM agents increasingly coordinate real workloads by selecting and chaining third-party tools based on text-visible metadata such as tool names, descriptions, and return messages. We show that this convenience creates a supply-chain attack surface: a malicious MCP tool server can be co-registered alongside normal tools and induce overthinking loops, where individually trivial or plausible tool calls compose into cyclic trajectories that inflate end-to-end tokens and latency without any single step looking abnormal. We formalize this as a structural overthinking attack, distinguishable from token-level verbosity, and implement 14 malicious tools across three servers that trigger repetition, forced refinement, and distraction. Across heterogeneous registries and multiple tool-capable models, the attack causes severe resource amplification (up to $142.4\times$ tokens) and can degrade task outcomes. Finally, we find that decoding-time concision controls do not reliably prevent loop induction, suggesting defenses should reason about tool-call structure rather than tokens alone.
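To make the idea of reasoning about tool-call structure concrete, the following is a minimal sketch of a loop check that inspects only the sequence of tool names in an agent trajectory, not the tokens. It is an illustrative assumption, not the authors' method or a defense evaluated in the paper. The function names `find_repeating_suffix` and `amplification`, the `min_repeats` threshold, and the tool names in the example are all hypothetical.

```python
from typing import Optional, Sequence, Tuple


def find_repeating_suffix(
    calls: Sequence[str], min_repeats: int = 3
) -> Optional[Tuple[str, ...]]:
    """Return the shortest pattern of tool names that repeats at least
    `min_repeats` times back-to-back at the end of `calls`, else None.

    Hypothetical helper: a purely structural check on the trajectory.
    """
    n = len(calls)
    # Try candidate cycle lengths from shortest to longest.
    for period in range(1, n // min_repeats + 1):
        pattern = tuple(calls[n - period:])
        window = calls[n - period * min_repeats:]
        # The window must consist of the pattern repeated end to end.
        if all(window[i] == pattern[i % period] for i in range(len(window))):
            return pattern
    return None


def amplification(attacked_tokens: int, baseline_tokens: int) -> float:
    """Ratio of end-to-end tokens under attack to tokens in a clean run."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return attacked_tokens / baseline_tokens


if __name__ == "__main__":
    # Hypothetical trajectory: each call looks plausible on its own,
    # but the last six calls form a refine/validate cycle.
    trajectory = [
        "search", "refine", "validate",
        "refine", "validate", "refine", "validate",
    ]
    print(find_repeating_suffix(trajectory))
    print(find_repeating_suffix(["search", "fetch", "summarize"]))
```

A check like this only flags exact repetition at the end of a trajectory. Loops that vary tool arguments or interleave distractor tools would need a looser notion of cycle, so the sketch is a starting point rather than a defense.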