Mar 16, 2026, arXiv:2603.15185

What Matters for Scalable and Robust Learning in End-to-End Driving Planners?

David Holtz, Niklas Hanselmann, Simon Doll, Marius Cordts, Bernt Schiele

AI Summary

This paper investigates the impact of architectural choices—high-resolution perceptual representations, disentangled trajectory representations, and generative planning—on the closed-loop performance of end-to-end autonomous driving systems. The authors find that open-loop performance gains do not always translate to robust closed-loop driving and identify limitations and synergies among these architectural patterns. Based on these findings, they introduce BevAD, a lightweight and scalable architecture that achieves a 72.7% success rate on the Bench2Drive benchmark using imitation learning.

Key Contribution

Seemingly beneficial architectural choices in end-to-end driving planners, like high-resolution perception, can actually hinder robust closed-loop performance, demanding a re-evaluation of design principles.

Abstract

End-to-end autonomous driving has gained significant attention for its potential to learn robust behavior in interactive scenarios and scale with data. Popular architectures often build on separate modules for perception and planning connected through latent representations, such as bird's-eye-view feature grids, to maintain end-to-end differentiability. This paradigm emerged mostly on open-loop datasets, with evaluation focusing not only on driving performance, but also on intermediate perception tasks. Unfortunately, architectural advances that excel in open-loop evaluation often fail to translate to scalable learning of robust closed-loop driving. In this paper, we systematically re-examine the impact of common architectural patterns on closed-loop performance: (1) high-resolution perceptual representations, (2) disentangled trajectory representations, and (3) generative planning. Crucially, our analysis evaluates the combined impact of these patterns, revealing both unexpected limitations and underexplored synergies. Building on these insights, we introduce BevAD, a novel lightweight and highly scalable end-to-end driving architecture. BevAD achieves a 72.7% success rate on the Bench2Drive benchmark and demonstrates strong data-scaling behavior using pure imitation learning. Our code and models are publicly available here: https://dmholtz.github.io/bevad/

Robotics & Embodied AI, Training Efficiency & Optimization, World Models & Planning

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
