Feb 23, 2026 | arXiv:2602.19633

TAPE: Tool-Guided Adaptive Planning and Constrained Execution in Language Model Agents

Jongwon Jeong, Jungtaek Kim, Kangwook Lee

AI Summary

The paper introduces Tool-guided Adaptive Planning with constrained Execution (TAPE), a novel framework designed to improve the robustness of language model agents in environments with strict feasibility constraints. TAPE addresses imperfect planning by constructing a plan graph and using an external solver to find feasible paths, and tackles stochastic execution with constrained decoding and adaptive re-planning based on environmental feedback. Experiments on Sokoban, ALFWorld, MuSiQue, and GSM8K-Hard show that TAPE significantly outperforms existing frameworks, achieving an average improvement of 21.0 percentage points on hard settings.

Key Contribution

Language agents can now navigate complex, constrained environments with significantly improved success rates thanks to a new framework that combines multi-plan aggregation with constrained decoding and adaptive re-planning.

Abstract

Language Model (LM) agents have demonstrated remarkable capabilities in solving tasks that require multiple interactions with the environment. However, they remain vulnerable in environments where a single error often leads to irrecoverable failure, particularly under strict feasibility constraints. We systematically analyze existing agent frameworks, identifying imperfect planning and stochastic execution as the primary causes. To address these challenges, we propose Tool-guided Adaptive Planning with constrained Execution (TAPE). TAPE enhances planning capability by aggregating multiple plans into a graph and employing an external solver to identify a feasible path. During execution, TAPE employs constrained decoding to reduce sampling noise, while adaptively re-planning whenever environmental feedback deviates from the intended state. Experiments across Sokoban, ALFWorld, MuSiQue, and GSM8K-Hard demonstrate that TAPE consistently outperforms existing frameworks, with particularly large gains on hard settings: success rates improve by 21.0 percentage points on average on hard settings and by 20.0 percentage points on average for weaker base models. Code and data are available here.
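The plan-aggregate, solve, execute, and re-plan loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a breadth-first search with a step budget stands in for the external solver, constrained execution is simplified to emitting only the action chosen by that search, and the names `build_plan_graph`, `find_feasible_path`, `run_episode`, `propose_plans`, and `step_env` are hypothetical. Plans are assumed to be lists of `(state, action, next_state)` triples.

```python
from collections import deque


def build_plan_graph(plans):
    """Merge candidate plans into one graph: state -> {action: next_state}."""
    graph = {}
    for plan in plans:
        for state, action, next_state in plan:
            graph.setdefault(state, {})[action] = next_state
            graph.setdefault(next_state, {})
    return graph


def find_feasible_path(graph, start, goal, max_steps):
    """BFS stand-in for an external solver (assumption, not the paper's solver).

    Returns the shortest action sequence from start to goal that uses at
    most max_steps actions, or None if no such path exists in the graph.
    """
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        if len(actions) >= max_steps:
            continue  # step budget exhausted on this branch
        for action, nxt in graph.get(state, {}).items():
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, actions + [action]))
    return None


def run_episode(start, goal, propose_plans, step_env, max_steps):
    """Execute the solver's path, re-planning when an observation deviates.

    propose_plans(state) and step_env(state, action) are hypothetical
    callables standing in for the LM planner and the environment.
    """
    state, trace, budget = start, [], max_steps
    while state != goal and budget > 0:
        graph = build_plan_graph(propose_plans(state))
        path = find_feasible_path(graph, state, goal, budget)
        if path is None:
            return None  # no feasible path within the remaining budget
        for action in path:
            expected = graph[state][action]
            # Simplified constrained execution: only the planned action is taken.
            observed = step_env(state, action)
            trace.append(action)
            budget -= 1
            state = observed
            if observed != expected:
                break  # feedback deviates from the intended state: re-plan
    return trace if state == goal else None
```

In this sketch, merging plans can expose a path that no single plan contains: if one plan reaches an intermediate state quickly but then dead-ends, and another reaches the goal from that same state by a longer route, the merged graph allows the search to combine the two.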

Reasoning & Chain-of-Thought | Tool Use & Agents | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 62

Year: 2026

Venue: N/A
