This paper introduces an energy-aware imitation learning framework for steering prediction in autonomous driving that uses both event and frame data to overcome the limitations of frame-based cameras. The core of the approach is an Energy-driven Cross-modality Fusion Module (ECFM) and an energy-aware decoder, designed to improve prediction reliability and safety. Experiments on the DDD20 and DRFuser datasets show that the proposed method outperforms existing state-of-the-art approaches in steering prediction.
Fusing event-camera data with conventional frames through an energy-aware approach improves steering prediction for autonomous vehicles, outperforming prior methods on DDD20 and DRFuser.
In autonomous driving, relying solely on frame-based cameras can lead to inaccuracies caused by factors such as long exposure times, high-speed motion, and challenging lighting conditions. To address these issues, we incorporate the event camera, a bio-inspired vision sensor. Unlike conventional cameras, event cameras capture sparse, asynchronous events, providing a complementary modality that mitigates these challenges. In this work, we propose an energy-aware imitation learning framework for steering prediction that leverages both events and frames. Specifically, we design an Energy-driven Cross-modality Fusion Module (ECFM) and an energy-aware decoder to produce reliable and safe predictions. Extensive experiments on two public real-world datasets, DDD20 and DRFuser, demonstrate that our method outperforms existing state-of-the-art (SOTA) approaches. The code and trained models will be released upon acceptance.
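To make the "sparse, asynchronous events" concrete, the following is a minimal sketch of one common way to turn an event stream into a frame-like array that can be paired with a camera frame: counting events per pixel and polarity over a time window. It is not the paper's ECFM or its preprocessing, which the abstract does not specify. The `(x, y, timestamp, polarity)` tuple layout, the function name `accumulate_events`, and the window parameters are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Assumed event layout: (x, y, timestamp_us, polarity), polarity is +1 or -1.
Event = Tuple[int, int, int, int]
Grid = List[List[int]]


def accumulate_events(
    events: Iterable[Event],
    width: int,
    height: int,
    t_start: int,
    t_end: int,
) -> Tuple[Grid, Grid]:
    """Count events per pixel within [t_start, t_end), split by polarity.

    Returns two height x width grids: positive-event counts and
    negative-event counts. Together they form a two-channel image that
    can sit alongside a conventional frame covering the same interval.
    """
    pos: Grid = [[0] * width for _ in range(height)]
    neg: Grid = [[0] * width for _ in range(height)]
    for x, y, t, polarity in events:
        # Skip events outside the time window or the sensor bounds.
        if not (t_start <= t < t_end):
            continue
        if not (0 <= x < width and 0 <= y < height):
            continue
        if polarity > 0:
            pos[y][x] += 1
        else:
            neg[y][x] += 1
    return pos, neg


# Hypothetical toy stream on a 2x2 sensor; the last event falls outside the window.
toy_events = [(0, 0, 10, 1), (0, 0, 20, 1), (1, 0, 30, -1), (1, 1, 999, 1)]
pos_counts, neg_counts = accumulate_events(toy_events, 2, 2, 0, 100)
```

Choosing the window `[t_start, t_end)` to match a frame's exposure interval is one simple way to align the two modalities in time; other event representations (for example, time surfaces or voxel grids) keep more temporal detail at the cost of a larger input.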