The paper introduces a reinforcement learning (RL)-guided sampling approach for autonomous vehicle motion planning, addressing the inefficiency of uniform or heuristic sampling in complex urban environments. The authors train an RL agent to steer the sampling process toward regions likely to yield feasible trajectories, using a world model (WM) based on a decodable deep set encoder to handle variable numbers of traffic participants. Results in the CommonRoad environment demonstrate a significant reduction in required samples (up to 99%) and runtime (up to 84%) while maintaining planning quality.
By learning where to sample, autonomous vehicles can achieve up to 84% faster motion planning in complex urban environments without sacrificing safety or success rates.
Sampling-based motion planning is a well-established approach in autonomous driving, valued for its modularity and analytical tractability. In complex urban scenarios, however, uniform or heuristic sampling often produces many infeasible or irrelevant trajectories. We address this limitation with a hybrid framework that learns where to sample while keeping trajectory generation and evaluation fully analytical and verifiable. A reinforcement learning (RL) agent guides the sampling process toward regions of the action space likely to yield feasible trajectories, while evaluation and final selection remain governed by deterministic feasibility checks and cost functions. We couple the RL sampler with a world model (WM) based on a decodable deep set encoder, enabling both variable numbers of traffic participants and reconstructable latent representations. The approach is evaluated in the CommonRoad simulation environment, showing up to 99% fewer required samples and a runtime reduction of up to 84% while maintaining planning quality in terms of success and collision-free rates. These improvements lead to faster, more reliable decision-making for autonomous vehicles in urban environments, achieving safer and more responsive navigation under real-world constraints. Code and trained artifacts are publicly available at: https://github.com/TUM-AVS/Learning-to-Sample
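The deep set encoder mentioned in the abstract can be sketched as follows: a per-participant network, permutation-invariant sum pooling, and a projection to a latent vector, paired with a decoder head so the latent remains reconstructable. This is a generic deep-set sketch under assumed dimensions and random weights standing in for trained parameters, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: each traffic participant is a feature vector
# (e.g. position, velocity, heading). Names are illustrative.
OBS_DIM, HIDDEN, LATENT = 4, 16, 8

# Random matrices stand in for trained network parameters.
W_phi = rng.normal(0.0, 0.1, (OBS_DIM, HIDDEN))
W_rho = rng.normal(0.0, 0.1, (HIDDEN, LATENT))
W_dec = rng.normal(0.0, 0.1, (LATENT, HIDDEN))

def encode(participants: np.ndarray) -> np.ndarray:
    """Deep set encoding: per-element network phi, permutation-invariant
    sum pooling, then rho. Handles any number of participants."""
    h = np.tanh(participants @ W_phi)   # phi applied to each participant
    pooled = h.sum(axis=0)              # order-independent aggregation
    return np.tanh(pooled @ W_rho)      # rho maps pooled features to latent

def decode(latent: np.ndarray) -> np.ndarray:
    """Decoder head mapping the latent back toward the pooled feature
    space, making the representation reconstructable."""
    return latent @ W_dec

# Same scene under two participant orderings yields an identical latent.
scene = rng.normal(size=(5, OBS_DIM))   # 5 traffic participants
z1 = encode(scene)
z2 = encode(scene[::-1])

# A scene with a different participant count needs no architectural change.
z3 = encode(rng.normal(size=(3, OBS_DIM)))
print(z1.shape, z3.shape, np.allclose(z1, z2))
```

Sum pooling is what gives both properties the abstract relies on: invariance to participant ordering and indifference to how many participants are present, while the decoder head keeps the latent state usable for reconstruction rather than opaque.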