This paper addresses zero-shot traffic accident anticipation by coupling a VideoMAE-v2 backbone with a per-frame prediction head under a sliding-window protocol, so that the model estimates collision risk at every frame without any target-domain training data. The model learns only from a publicly available binary-labelled driving-accident dataset and must generalise to unseen dashcam footage. The method placed 2nd in the 2026 CVPR@AUTOPILOT Zero-Shot Accident Anticipation competition.
A model trained only on a public, binary-labelled accident dataset can anticipate accidents frame by frame on unseen dashcam footage without target-domain annotations, placing 2nd in the 2026 CVPR@AUTOPILOT Zero-Shot Accident Anticipation competition.
Traffic accident anticipation -- predicting the likelihood of an imminent collision at every frame of a dashcam video -- is safety-critical yet difficult to scale, because collecting in-domain annotated accident footage for every deployment scenario is prohibitively expensive. We study this task under a zero-shot setting where no target-domain training data is available: the model must learn exclusively from a publicly available binary-labelled driving-accident dataset and generalise to unseen dashcam footage. We propose a framework that bridges the gap between the frame-level temporal risk estimation task and coarsely labelled binary accident datasets by coupling a VideoMAE-v2 backbone with a per-frame prediction head under a sliding-window protocol. Our method achieves 2nd place in the 2026 CVPR@AUTOPILOT Zero-Shot Accident Anticipation competition. Code is available at https://github.com/TimeSouth/zero-shot-taa-solution.
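To make the sliding-window protocol concrete, here is a minimal Python sketch of how per-frame risk scores might be assembled from overlapping windows. It is not the authors' implementation: the abstract does not state the window length, the stride, or how overlapping predictions are combined, so `window`, `stride`, and the averaging step are all assumptions, and the hypothetical `score_window` callable stands in for the VideoMAE-v2 backbone plus per-frame head.

```python
from typing import Callable, List, Sequence


def sliding_window_scores(
    num_frames: int,
    window: int,
    stride: int,
    score_window: Callable[[int, int], Sequence[float]],
) -> List[float]:
    """Return one risk score per frame by averaging overlapping windows.

    score_window(start, end) is a hypothetical stand-in for the model: it
    must return one score per frame in the half-open range [start, end).
    """
    if num_frames <= 0 or window <= 0 or stride <= 0:
        raise ValueError("num_frames, window and stride must be positive")

    # Clips shorter than the window are scored as a single window.
    window = min(window, num_frames)

    # Window start indices; add a final window so trailing frames are covered.
    starts = list(range(0, num_frames - window + 1, stride))
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)

    sums = [0.0] * num_frames
    counts = [0] * num_frames
    for start in starts:
        scores = score_window(start, start + window)
        if len(scores) != window:
            raise ValueError("score_window must return one score per frame")
        for offset, score in enumerate(scores):
            sums[start + offset] += float(score)
            counts[start + offset] += 1

    # Assumed aggregation: mean over every window that covers the frame.
    return [total / count for total, count in zip(sums, counts)]


if __name__ == "__main__":
    # Toy scorer whose risk grows linearly with the frame index.
    def toy_scorer(start: int, end: int) -> List[float]:
        return [i / 10 for i in range(start, end)]

    print(sliding_window_scores(11, window=4, stride=3, score_window=toy_scorer))
```

Averaging is only one plausible choice; taking the score from the most recent window would suit a strictly causal deployment better, since it avoids using frames that arrive after the one being scored.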