Apr 16, 2026 | arXiv:2604.14712

SGA-MCTS: Decoupling Planning from Execution via Training-Free Atomic Experience Retrieval

Xinghong Xie, Xin Xie, Dongyun Xue, Wuguannan Yao, Mingxiao Feng, Wengang Zhou, Xiang Qi, Houqiang Li

AI Summary

The paper introduces SGA-MCTS, a framework that decouples LLM planning from execution. Offline, Monte Carlo Tree Search (MCTS) explores the solution space and its trajectories are distilled into de-lexicalized State-Goal-Action (SGA) atoms that capture reusable causal logic. Online, a hybrid symbolic-semantic mechanism retrieves relevant atoms and re-grounds them in the current context as reasoning hints. By amortizing the cost of search, SGA-MCTS lets frozen, open-weights models match the performance of state-of-the-art systems on complex benchmarks without task-specific fine-tuning.

Key Contribution

SGA-MCTS amortizes offline MCTS search into retrievable SGA atoms, letting frozen, open-weights LLMs match SOTA systems (e.g., GPT-5) on complex planning benchmarks at System 1 inference speeds, without task-specific fine-tuning.

Abstract

LLM-powered systems require complex multi-step decision-making abilities to solve real-world tasks, yet current planning approaches face a trade-off between the high latency of inference-time search and the limited generalization of supervised fine-tuning. To address this limitation, we introduce SGA-MCTS, a framework that casts LLM planning as non-parametric retrieval. Offline, we leverage Monte Carlo Tree Search (MCTS) to explore the solution space and distill high-fidelity trajectories into State-Goal-Action (SGA) atoms. These atoms are de-lexicalized primitives that abstract concrete entities into symbolic slots, preserving reusable causal logic while discarding domain-specific noise. Online, a retrieval-augmented agent employs a hybrid symbolic-semantic mechanism to fetch relevant SGAs and re-ground them into the current context as soft reasoning hints. Empirical results on complex benchmarks demonstrate that this paradigm enables frozen, open-weights models to match the performance of SOTA systems (e.g., GPT-5) without task-specific fine-tuning. By effectively amortizing the heavy computational cost of search, SGA-MCTS achieves System 2 reasoning depth at System 1 inference speeds, rendering autonomous planning both scalable and real-time feasible.
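
To make the online stage concrete, the following is a minimal illustrative sketch of de-lexicalization, hybrid retrieval, and re-grounding. It is not the authors' implementation: the `SGAAtom` fields, the slot syntax (`<ITEM>`), the Jaccard and bag-of-words cosine scores, and the mixing weight `alpha` are all assumptions chosen for illustration, and the offline MCTS stage that would produce the atoms is omitted.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SGAAtom:
    """Hypothetical atom layout: all three fields use symbolic slots."""
    state: frozenset  # symbolic predicates, e.g. "located(<ITEM>, <PLACE>)"
    goal: str         # de-lexicalized goal text
    action: str       # action template to be re-grounded


def delexicalize(text, entities):
    """Replace concrete entity mentions with symbolic slots."""
    for slot, value in entities.items():
        text = text.replace(value, f"<{slot}>")
    return text


def reground(template, bindings):
    """Fill symbolic slots with entities from the current context."""
    for slot, value in bindings.items():
        template = template.replace(f"<{slot}>", value)
    return template


def _bag_of_words(text):
    # Toy stand-in for a semantic encoder (assumption, not from the paper).
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def _jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def hybrid_score(atom, state, goal, alpha=0.5):
    """Mix symbolic overlap on states with semantic similarity on goals."""
    symbolic = _jaccard(atom.state, state)
    semantic = _cosine(_bag_of_words(atom.goal), _bag_of_words(goal))
    return alpha * symbolic + (1 - alpha) * semantic


def retrieve(atoms, state, goal, k=1, alpha=0.5):
    """Return the k highest-scoring atoms for the current state and goal."""
    ranked = sorted(
        atoms, key=lambda a: hybrid_score(a, state, goal, alpha), reverse=True
    )
    return ranked[:k]
```

In use, the current state and goal would be de-lexicalized with the entities found in the context, the top-scoring atom retrieved, and its action template re-grounded with the same bindings before being passed to the frozen model as a soft hint rather than a hard constraint.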

Reasoning & Chain-of-Thought | Tool Use & Agents | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 37

Year: 2026

Venue: N/A
