Mar 10, 2026 · arXiv:2603.09695

DRIFT: Dual-Representation Inter-Fusion Transformer for Automated Driving Perception with 4D Radar Point Clouds

Siqi Pei, Andras Palffy, Dariu M. Gavrila

AI Summary

The paper introduces DRIFT, a dual-path transformer network for processing 4D radar point clouds in automated driving, designed to capture both local and global contextual information. DRIFT uses a point path for fine-grained local features and a pillar path for coarse-grained global features; the two paths exchange information through feature-sharing layers at multiple stages. On the View-of-Delft dataset, DRIFT reaches 52.6% mAP in object detection, compared with 45.4% for the CenterPoint baseline.

Key Contribution

DRIFT improves object detection on 4D radar point clouds over baselines such as CenterPoint by fusing local and global context in a dual-representation transformer architecture.

Abstract

4D radars, which provide 3D point cloud data along with Doppler velocity, are attractive components of modern automated driving systems due to their low cost and robustness under adverse weather conditions. However, they provide a significantly lower point cloud density than LiDAR sensors. This makes it important to exploit not only local but also global contextual scene information. This paper proposes DRIFT, a model that effectively captures and fuses both local and global contexts through a dual-path architecture. The model incorporates a point path to aggregate fine-grained local features and a pillar path to encode coarse-grained global features. These two parallel paths are intertwined via novel feature-sharing layers at multiple stages, enabling full utilization of both representations. DRIFT is evaluated on the widely used View-of-Delft (VoD) dataset and a proprietary internal dataset. It outperforms the baselines on the tasks of object detection and/or free road estimation. For example, DRIFT achieves a mean average precision (mAP) of 52.6% on the VoD dataset, compared to 45.4% for CenterPoint.
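The abstract does not say how the feature-sharing layers are implemented, so the following is only a minimal sketch of the general idea of exchanging information between a point representation and a pillar representation. Everything in it is an assumption made for illustration: the function names (`pillar_index`, `points_to_pillars`, `pillars_to_points`), the 1 m grid cell, mean pooling for the point-to-pillar direction, and concatenation for the pillar-to-point direction. It contains no transformer layers and is not the authors' method.

```python
import math
from collections import defaultdict


def pillar_index(x, y, cell=1.0):
    """Map a point's (x, y) position to an integer pillar (grid cell) index."""
    return (math.floor(x / cell), math.floor(y / cell))


def points_to_pillars(points, feats, cell=1.0):
    """Point -> pillar: average the features of all points in each pillar.

    Mean pooling is an assumption; the paper may aggregate differently.
    """
    sums = {}
    counts = defaultdict(int)
    for (x, y, _z), f in zip(points, feats):
        key = pillar_index(x, y, cell)
        if key not in sums:
            sums[key] = [0.0] * len(f)
        sums[key] = [s + v for s, v in zip(sums[key], f)]
        counts[key] += 1
    return {k: [s / counts[k] for s in v] for k, v in sums.items()}


def pillars_to_points(points, feats, pillar_feats, cell=1.0):
    """Pillar -> point: append each point's pillar feature to its own feature.

    Concatenation is an assumption standing in for a learned sharing layer.
    """
    return [
        list(f) + pillar_feats[pillar_index(x, y, cell)]
        for (x, y, _z), f in zip(points, feats)
    ]


# Three hypothetical radar points (x, y, z) with 2-dimensional features.
points = [(0.2, 0.3, 0.0), (0.8, 0.1, 0.5), (3.5, 2.2, 0.1)]
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

pillars = points_to_pillars(points, feats)       # coarse, per-cell context
fused = pillars_to_points(points, feats, pillars)  # per-point features + context
```

In this toy example the first two points fall into the same pillar, so each of them receives the pillar's averaged feature alongside its own, while the isolated third point only sees itself. With sparse radar returns, many pillars hold a single point, which is one way to see why the abstract stresses capturing global context in addition to local features.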

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Robotics & Embodied AI

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
