ReinDriveGen enables controllable generation of driving scenes with edited actor trajectories. It uses multi-frame LiDAR data to construct dynamic 3D point cloud scenes, completes vehicle geometries, and renders 2D condition images for a video diffusion model. To address the out-of-distribution nature of edited scenes, the authors introduce an RL-based post-training strategy built on a pairwise preference model and a pairwise reward mechanism. Experiments show that ReinDriveGen outperforms existing methods on edited driving scenarios and achieves state-of-the-art results in novel ego viewpoint synthesis.
Generate safety-critical driving scenarios with full trajectory control, even *beyond* your training data, using RL to fine-tune a video diffusion model.
We present ReinDriveGen, a framework that enables full controllability over dynamic driving scenes, allowing users to freely edit actor trajectories to simulate safety-critical corner cases such as front-vehicle collisions, drifting cars, vehicles spinning out of control, pedestrians jaywalking, and cyclists cutting across lanes. Our approach constructs a dynamic 3D point cloud scene from multi-frame LiDAR data, introduces a vehicle completion module to reconstruct full 360° geometry from partial observations, and renders the edited scene into 2D condition images that guide a video diffusion model to synthesize realistic driving videos. Since such edited scenarios inevitably fall outside the training distribution, we further propose an RL-based post-training strategy with a pairwise preference model and a pairwise reward mechanism, enabling robust quality improvement under out-of-distribution conditions without ground-truth supervision. Extensive experiments demonstrate that ReinDriveGen outperforms existing approaches on edited driving scenarios and achieves state-of-the-art results on novel ego viewpoint synthesis.
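The abstract does not specify how edited point clouds are turned into condition images, so the following is only an illustrative sketch of the general idea and not the authors' implementation. It assumes a simple pinhole camera, points already expressed in the camera frame (x right, y down, z forward), and a sparse depth map as the condition image. The function names `apply_pose` and `render_condition_image`, and all parameters, are hypothetical.

```python
import math

def apply_pose(points, yaw, tx, tz):
    """Move an actor's points to an edited pose (hypothetical helper).

    Rotates about the vertical (y) axis by `yaw` radians, then translates
    in the ground plane (x, z). Points are (x, y, z) tuples in the camera frame.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x + s * z + tx, y, -s * x + c * z + tz) for x, y, z in points]

def render_condition_image(points, fx, fy, cx, cy, width, height):
    """Project points through an assumed pinhole camera into a sparse depth image.

    Keeps the nearest depth per pixel (a simple z-buffer). Pixels that no
    point lands on stay 0.0.
    """
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:  # skip points behind the camera
            continue
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth

# Usage: shift a one-point "vehicle" 2 m to the right and re-render.
actor = [(0.0, 0.0, 10.0)]
edited = apply_pose(actor, yaw=0.0, tx=2.0, tz=0.0)
image = render_condition_image(edited, 100.0, 100.0, 32.0, 24.0, 64, 48)
```

In this toy setup, editing the trajectory only changes where the actor's points land in the rendered image; a generative model conditioned on such images would then be responsible for synthesizing realistic appearance.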