May 25, 2026

arXiv:2605.25393

Decision-Making with Lightweight Confidence-Aware Language Model for Autonomous Driving

Ruoyu Yao, Ruiguo Zhong, Mingxing Peng, Jun Ma

AI Summary

This paper introduces a decision-making framework for autonomous driving that distills the reasoning capabilities of large language models into a lightweight, confidence-aware language model. The framework uses a multi-agent system to generate high-quality, confidence-annotated decision demonstrations via chain-of-thought reasoning, which are then used to fine-tune a dual-head language model with RAG. Experiments on the nuPlan benchmark show the approach achieves state-of-the-art success rates with low inference latency, even in long-tail scenarios.

Key Contribution

A distilled, lightweight language model can reach state-of-the-art closed-loop success rates on the nuPlan benchmark while also reporting how confident it is in each decision.

Abstract

Large Language Models (LLMs) and Multimodal LLMs (MLLMs) have demonstrated immense potential in autonomous driving (AD) by offering human-like reasoning and open-world generalization. However, the excessive computational overhead and high inference latency of these massive models severely hinder their deployment in resource-constrained AD systems. To address this challenge, we propose a novel decision-making framework utilizing a lightweight confidence-aware language model, which bridges the gap between complex multimodal intention reasoning and efficient inference. Specifically, we design a multi-agent collaborative workflow, comprising action voting, confidence assessment, and summarization agents, to generate high-quality, confidence-annotated decision demonstrations via explicit Chain-of-Thought (CoT) reasoning. These demonstrations are then distilled into a lightweight language model featuring a dual-head architecture, enabling the joint prediction of decision probabilities and the generation of textual rationales. The distillation is realized via a confidence-aware fine-tuning strategy coupled with Retrieval Augmented Generation (RAG) to enhance the model's adaptability and data efficiency. Comprehensive closed-loop experiments on the nuPlan benchmark demonstrate that our approach achieves state-of-the-art (SOTA) success rates in both regular and long-tail scenarios while maintaining low inference latency.
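The abstract does not give the exact label format or training objective, so the following is only a minimal sketch of one way the described pieces could fit together: votes from several agents are aggregated into a soft, confidence-annotated decision label, and a dual-head student is trained with a joint loss over the decision head and the rationale head. The action names, the `text_weight` parameter, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical discrete action set; the paper's actual action space is not given here.
ACTIONS = ["keep_lane", "change_left", "change_right", "decelerate"]


def aggregate_votes(votes):
    """Turn discrete action votes from several agents into a soft label.

    The vote share of each action serves as an assumed stand-in for the
    confidence annotation attached to a decision demonstration.
    """
    counts = Counter(votes)
    total = len(votes)
    return [counts.get(action, 0) / total for action in ACTIONS]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    norm = sum(exps)
    return [e / norm for e in exps]


def soft_cross_entropy(logits, target):
    """Cross-entropy between a soft target distribution and softmax(logits)."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


def joint_loss(decision_logits, soft_target, rationale_nll, text_weight=0.5):
    """Assumed joint objective for a dual-head student.

    decision_logits: output of the decision head (one logit per action).
    soft_target:     confidence-annotated label from aggregate_votes.
    rationale_nll:   negative log-likelihood of the textual rationale,
                     as produced by the language-modeling head.
    text_weight:     hypothetical trade-off between the two heads.
    """
    decision_term = soft_cross_entropy(decision_logits, soft_target)
    return decision_term + text_weight * rationale_nll


if __name__ == "__main__":
    label = aggregate_votes(["keep_lane", "keep_lane", "decelerate", "keep_lane"])
    print(label)
    print(joint_loss([2.0, 0.0, 0.0, 1.0], label, rationale_nll=1.2))
```

Training against a soft label rather than a one-hot action lets the decision head's output probabilities reflect disagreement among the teacher agents, which is one plausible reading of "confidence-aware" fine-tuning; the paper's actual strategy may differ.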

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
