The paper introduces a hierarchical multi-agent task-planning framework for multi-robot systems that uses LLMs for task decomposition and a classical PDDL planner to solve the resulting planning problems. To compensate for LLM errors such as hallucinated or infeasible actions, the framework applies a TextGrad-inspired approach to optimize the prompts of individual lower-layer agents, and shares meta-prompts across agents to make that optimization more efficient. On the MAT-THOR benchmark, the proposed planner outperforms the previous state-of-the-art LaMMA-P by 2, 7, and 15 percentage points on compound, complex, and vague tasks respectively.
An LLM-based planner for multi-robot tasks raises the success rate on vague instructions by 15 percentage points over LaMMA-P (to 0.60) by optimizing prompts with textual gradients and sharing meta-prompts across agents.
Multi-robot task planning requires decomposing natural-language instructions into executable actions for heterogeneous robot teams. Conventional Planning Domain Definition Language (PDDL) planners provide rigorous guarantees but struggle to handle ambiguous or long-horizon missions, while large language models (LLMs) can interpret instructions and propose plans but may hallucinate or produce infeasible actions. We present a hierarchical multi-agent LLM-based planner with prompt optimization: an upper layer decomposes tasks and assigns them to lower-layer agents, which generate PDDL problems solved by a classical planner. When plans fail, the system applies TextGrad-inspired textual-gradient updates to optimize each agent's prompt and thereby improve planning accuracy. In addition, meta-prompts are learned and shared across agents within the same layer, enabling efficient prompt optimization in multi-agent settings. On the MAT-THOR benchmark, our planner achieves success rates of 0.95 on compound tasks, 0.84 on complex tasks, and 0.60 on vague tasks, improving over the previous state-of-the-art LaMMA-P by 2, 7, and 15 percentage points respectively. An ablation study shows that the hierarchical structure, prompt optimization, and meta-prompt sharing contribute roughly +59, +37, and +4 percentage points to the overall success rate.
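The control flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Agent`, `SharedMeta`, `run_agent`, and `plan_mission`, the callables `decompose`, `generate`, `solve`, `critique`, and `revise`, and the `max_rounds` limit are all hypothetical stand-ins for the upper-layer LLM, the lower-layer LLM, the classical PDDL planner, and the textual-gradient step. How the paper actually learns and shares meta-prompts is not specified here, so the sketch simply folds each piece of feedback into one shared string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Agent:
    name: str
    prompt: str  # agent-specific prompt, revised after each failed plan


@dataclass
class SharedMeta:
    text: str = ""  # meta-prompt visible to every agent in the same layer


def run_agent(
    agent: Agent,
    subtask: str,
    meta: SharedMeta,
    generate: Callable[[str], str],               # LLM: prompt -> PDDL problem text
    solve: Callable[[str], Optional[List[str]]],  # planner: plan, or None on failure
    critique: Callable[[str, str, str], str],     # LLM: textual feedback on a failure
    revise: Callable[[str, str], str],            # LLM: apply feedback to a prompt
    max_rounds: int = 3,                          # assumed retry budget
) -> Optional[List[str]]:
    """Generate a PDDL problem, solve it, and refine prompts when planning fails."""
    for _ in range(max_rounds):
        problem = generate(f"{meta.text}\n{agent.prompt}\n{subtask}")
        plan = solve(problem)
        if plan is not None:
            return plan
        # Textual-gradient-style update: natural-language feedback plays the
        # role of a gradient and is applied to the prompt that produced the failure.
        feedback = critique(agent.prompt, subtask, problem)
        agent.prompt = revise(agent.prompt, feedback)
        # Assumed sharing rule: the same feedback also updates the shared meta-prompt.
        meta.text = revise(meta.text, feedback)
    return None


def plan_mission(
    instruction: str,
    agents: List[Agent],
    decompose: Callable[[str, List[str]], Dict[str, str]],  # upper-layer LLM
    **steps,
) -> Dict[str, Optional[List[str]]]:
    """Upper layer assigns subtasks; each lower-layer agent plans its own subtask."""
    meta = SharedMeta()
    subtasks = decompose(instruction, [a.name for a in agents])
    return {
        a.name: run_agent(a, subtasks[a.name], meta, **steps)
        for a in agents
        if a.name in subtasks
    }
```

With stub callables in place of the LLM and planner, a lesson learned from the first agent's failure reaches the second agent through `meta.text`, so the second agent can succeed without revising its own prompt. That is the efficiency argument for sharing meta-prompts within a layer, expressed in its simplest form.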