ALOOD addresses the problem of overconfident object detection in LiDAR-based autonomous driving systems when encountering out-of-distribution (OOD) objects. It aligns object features from a LiDAR object detector with the feature space of a vision-language model (VLM), enabling OOD object detection as a zero-shot classification task. Experiments on the nuScenes OOD benchmark demonstrate competitive performance, indicating the effectiveness of leveraging language representations for this task.
LiDAR object detectors can now spot the unexpected by borrowing language understanding from vision-language models, turning OOD detection into a zero-shot game.
LiDAR-based 3D object detection plays a critical role in reliable and safe autonomous driving systems. However, existing detectors often produce overly confident predictions for objects that do not belong to any known category, posing significant safety risks. Such out-of-distribution (OOD) objects were not part of the training data, so the detector assigns them incorrect predictions. To address this challenge, we propose ALOOD (Aligned LiDAR representations for Out-Of-Distribution Detection), a novel approach that incorporates language representations from a vision-language model (VLM). By aligning the object features from the object detector to the feature space of the VLM, we can treat the detection of OOD objects as a zero-shot classification task. We demonstrate competitive performance on the nuScenes OOD benchmark, establishing a novel approach to OOD object detection in LiDAR using language representations. The source code is available at https://github.com/uulm-mrm/mmood3d.
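To make the zero-shot framing concrete, the following is a minimal sketch of how an object feature that has already been aligned to a VLM feature space could be scored against text embeddings of the known classes. It is not the ALOOD implementation: the toy 3-D embeddings, the function names, the temperature value, and the scoring rule (one minus the maximum softmax probability over cosine similarities) are all assumptions chosen for illustration. In practice the text embeddings would come from the VLM's text encoder and the object feature from the aligned LiDAR detector.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_similarities(feature, text_embeddings):
    """Cosine similarity between one object feature and each class text embedding."""
    f = l2_normalize(feature)
    return [sum(a * b for a, b in zip(f, l2_normalize(t))) for t in text_embeddings]

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_ood_score(feature, text_embeddings, temperature=0.07):
    """Return (best known-class index, OOD score).

    Assumed scoring rule: 1 - max softmax probability over the known classes.
    A high score means no known class matches the feature well.
    """
    sims = cosine_similarities(feature, text_embeddings)
    probs = softmax([s / temperature for s in sims])
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, 1.0 - probs[best]

# Hypothetical stand-ins for VLM text embeddings of the known classes.
CLASS_NAMES = ["car", "pedestrian", "bicycle"]
TEXT_EMBEDDINGS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical aligned object features.
car_like = [0.9, 0.1, 0.0]    # close to the "car" embedding
ambiguous = [1.0, 1.0, 1.0]   # equally similar to every known class

car_idx, car_score = zero_shot_ood_score(car_like, TEXT_EMBEDDINGS)
amb_idx, amb_score = zero_shot_ood_score(ambiguous, TEXT_EMBEDDINGS)

print(CLASS_NAMES[car_idx], round(car_score, 4))
print(CLASS_NAMES[amb_idx], round(amb_score, 4))
```

In this toy setup the car-like feature receives an OOD score near zero, while the feature that matches no class better than another receives a much higher score. A threshold on that score would then separate in-distribution from OOD detections; how the threshold is chosen is left open here.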