DeepMind, UT Austin | Mar 4, 2026 | arXiv:2603.03704

Large-Language-Model-Guided State Estimation for Partially Observable Task and Motion Planning

Yoonwoo Kim, Raghav Arora, Peter Stone, Ben Abbatematteo

AI Summary

This paper introduces CoCo-TAMP, a planning and execution framework with hierarchical state estimation that uses LLMs to shape the belief over task-relevant objects in partially observable environments. CoCo-TAMP draws on LLMs for common-sense knowledge about where objects are typically found and which objects tend to be co-located, improving the efficiency of task and motion planning. In experiments, it reduces planning and execution time by an average of 62.7% in simulation and 72.6% in real-world demonstrations, compared to a baseline without common-sense knowledge.

Key Contribution

LLMs can accelerate robot planning in partially observable environments by injecting common-sense priors about object locations and co-occurrences, cutting planning and execution time by an average of 72.6% in real-world demonstrations.

Abstract

Robot planning in partially observable environments, where not all objects are known or visible, is a challenging problem, as it requires reasoning under uncertainty through partially observable Markov decision processes. During the execution of a computed plan, a robot may unexpectedly observe task-irrelevant objects, which are typically ignored by naive planners. In this work, we propose incorporating two types of common-sense knowledge: (1) certain objects are more likely to be found in specific locations; and (2) similar objects are likely to be co-located, while dissimilar objects are less likely to be found together. Manually engineering such knowledge is complex, so we explore leveraging the powerful common-sense reasoning capabilities of large language models (LLMs). Our planning and execution framework, CoCo-TAMP, introduces a hierarchical state estimation that uses LLM-guided information to shape the belief over task-relevant objects, enabling efficient solutions to long-horizon task and motion planning problems. In experiments, CoCo-TAMP achieves an average reduction of 62.7% in planning and execution time in simulation, and 72.6% in real-world demonstrations, compared to a baseline that does not incorporate either type of common-sense knowledge.
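The two knowledge types in the abstract can be illustrated with a small belief update over candidate locations. The sketch below is not the paper's implementation: the function names, the hard-coded plausibility scores (standing in for LLM output), the similarity value in [-1, 1], and the `strength` parameter are all hypothetical choices made only to show the idea of shaping a belief with a location prior and a co-location cue.

```python
from typing import Dict

Belief = Dict[str, float]  # location name -> probability that the target is there


def normalize(weights: Dict[str, float]) -> Belief:
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("belief has no probability mass")
    return {loc: w / total for loc, w in weights.items()}


def location_prior(plausibility: Dict[str, float]) -> Belief:
    """Knowledge type (1): turn per-location plausibility scores into a prior.

    In a real system the scores would come from an LLM query; here they are
    hand-written placeholders (an assumption of this sketch).
    """
    return normalize(plausibility)


def update_on_colocation(belief: Belief, observed_at: str,
                         similarity: float, strength: float = 2.0) -> Belief:
    """Knowledge type (2): reweight the belief after seeing another object.

    `similarity` in [-1, 1] is a hypothetical score between the observed object
    and the target. Similar objects (positive) raise the target's probability
    at `observed_at`; dissimilar objects (negative) lower it.
    """
    factor = strength ** similarity
    reweighted = {loc: p * (factor if loc == observed_at else 1.0)
                  for loc, p in belief.items()}
    return normalize(reweighted)


# Hypothetical scores for where a mug might be.
prior = location_prior({"kitchen_cabinet": 6.0, "desk": 3.0, "bathroom_shelf": 1.0})

# The robot incidentally sees a plate (assumed similar to a mug) in the cabinet.
posterior = update_on_colocation(prior, "kitchen_cabinet", similarity=1.0)

for loc in prior:
    print(f"{loc}: prior={prior[loc]:.3f} posterior={posterior[loc]:.3f}")
```

With these placeholder numbers, seeing a similar object in the cabinet shifts probability mass toward the cabinet and away from the other locations, so a planner that searches the most likely location first would look there earlier. A dissimilar observation (negative similarity) would shift mass the other way.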

Robotics & Embodied AI | Tool Use & Agents | World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
