Mar 10, 2026 · arXiv:2603.09481

GenePlan: Evolving Better Generalized PDDL Plans using Large Language Models

Andrew Murray, Danial Dervovic, Alberto Pozanco, Michael Cashmore

AI Summary

GenePlan uses LLM-assisted evolutionary algorithms to generate domain-dependent generalized planners from PDDL descriptions. It evolves interpretable Python planners to minimize plan length across diverse problem instances, effectively framing generalized planning as an optimization problem. GenePlan achieves a 0.91 average SAT score across eight domains, closely matching state-of-the-art planners while significantly outperforming LLM-based baselines like chain-of-thought prompting.
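The SAT score is not defined on this page. Assuming it follows the satisficing-track convention of the International Planning Competition, where each task scores the ratio of a reference plan cost to the produced plan cost and an unsolved task scores zero, an average could be computed as in the sketch below. The function name and argument layout are illustrative and are not taken from the paper.

```python
from typing import Optional, Sequence


def average_sat_score(
    plan_costs: Sequence[Optional[float]],
    reference_costs: Sequence[float],
) -> float:
    """Average per-task score under an ASSUMED IPC-style definition.

    plan_costs[i] is the cost of the plan found for task i, or None if the
    task was not solved. reference_costs[i] is the best known cost for task i.
    """
    if len(plan_costs) != len(reference_costs):
        raise ValueError("need one reference cost per task")
    if not plan_costs:
        raise ValueError("need at least one task")

    total = 0.0
    for cost, reference in zip(plan_costs, reference_costs):
        if cost is None:
            continue  # unsolved task contributes 0
        if cost <= 0:
            raise ValueError("plan costs are assumed to be positive")
        # Cap at 1.0 in case the plan beats the reference cost.
        total += min(1.0, reference / cost)
    return total / len(plan_costs)


# Three tasks: one solved at reference cost, one unsolved, one twice as long.
score = average_sat_score([10, None, 20], [10, 5, 10])
```

Under this assumed definition, the three example tasks score 1.0, 0.0, and 0.5, so `score` is 0.5.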

Key Contribution

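LLM-assisted evolution can produce interpretable Python planners whose average SAT score (0.91) comes close to that of state-of-the-art planners (0.93). The generated planners solve new instances quickly (0.49 seconds per task on average) and are cheap to produce (on average $1.82 per domain using GPT-4o).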

Abstract

We present GenePlan (GENeralized Evolutionary Planner), a novel framework that leverages large language model (LLM) assisted evolutionary algorithms to generate domain-dependent generalized planners for classical planning tasks described in PDDL. By casting generalized planning as an optimization problem, GenePlan iteratively evolves interpretable Python planners that minimize plan length across diverse problem instances. In empirical evaluation across six existing benchmark domains and two new domains, GenePlan achieved an average SAT score of 0.91, closely matching the performance of the state-of-the-art planners (SAT score 0.93), and significantly outperforming other LLM-based baselines such as chain-of-thought (CoT) prompting (average SAT score 0.64). The generated planners solve new instances rapidly (average 0.49 seconds per task) and at low cost (average $1.82 per domain using GPT-4o).
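The abstract frames generalized planning as optimization over candidate planners, with plan length as the objective. The following is a minimal sketch of that idea, not the authors' algorithm: it uses a plain elitist evolutionary loop, and `propose_variant` is a hypothetical stand-in for the step where an LLM would rewrite a planner. The toy domain, the failure penalty, and every name here are assumptions made for illustration.

```python
import random
from typing import Callable, List, Optional, Sequence, TypeVar

C = TypeVar("C")  # candidate planner (in GenePlan this would be Python source)
P = TypeVar("P")  # problem instance


def fitness(
    candidate: C,
    instances: Sequence[P],
    run_planner: Callable[[C, P], Optional[int]],
    failure_penalty: float = 1000.0,
) -> float:
    """Mean plan length over instances; failed instances pay a fixed penalty."""
    total = 0.0
    for instance in instances:
        length = run_planner(candidate, instance)
        total += failure_penalty if length is None else length
    return total / len(instances)


def evolve(
    seed: C,
    instances: Sequence[P],
    run_planner: Callable[[C, P], Optional[int]],
    propose_variant: Callable[[C, random.Random], C],
    generations: int = 20,
    population_size: int = 4,
    rng: Optional[random.Random] = None,
) -> C:
    """Elitist loop: keep the best candidates among parents and children."""
    rng = rng or random.Random(0)
    population: List[C] = [seed]
    for _ in range(generations):
        children = [
            propose_variant(rng.choice(population), rng)
            for _ in range(population_size)
        ]
        ranked = sorted(
            population + children,
            key=lambda c: fitness(c, instances, run_planner),
        )
        population = ranked[:population_size]
    return population[0]


# Toy domain (hypothetical): reach distance d using jumps of size k, then
# single steps. A "planner" is just the jump size k.
def toy_run_planner(k: int, d: int) -> Optional[int]:
    if k < 1:
        return None  # invalid planner: treated as a failure
    return d // k + d % k


def toy_propose_variant(k: int, rng: random.Random) -> int:
    # Stand-in for an LLM rewrite: nudge the parameter up or down.
    return k + rng.choice([-1, 1])


train = [12, 18, 24]
best = evolve(1, train, toy_run_planner, toy_propose_variant)
```

Because the best candidate always survives into the next generation, the returned planner is never worse on the training instances than the seed. In a real system the fitness call would execute generated planner code against PDDL instances and validate the resulting plans.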

Code Generation & Program Synthesis · Reasoning & Chain-of-Thought · World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
