The paper introduces a framework that uses Stable Diffusion to generate synthetic image datasets for autonomous vehicle vision tasks, addressing the limitations of real-world data collection. Images are generated with Stable Diffusion, annotated with SAM and CVAT, and evaluated with YOLOv8 on object detection and segmentation. Results show that the synthetic data achieves comparable or superior performance to real-world datasets, highlighting its potential as a scalable and cost-effective alternative.
Synthetic data generated via Stable Diffusion and segmented with SAM can match or outperform real-world data for training autonomous vehicle vision models, offering a scalable and cost-effective alternative.
This research addresses the challenges of training object detection and segmentation models for autonomous vehicles on real-world data, which is often difficult to collect, limited in diversity, and expensive to annotate. To overcome these limitations, we propose a novel framework for generating synthetic datasets using deep learning techniques and generative AI. Specifically, we use Stable Diffusion models to generate synthetic images and annotate them using a combination of the Segment Anything Model (SAM) and the CVAT tool. The annotations are then refined to produce label files, and the dataset is split into training and validation sets. For evaluation, YOLOv8 is trained on the synthetic dataset for object detection and segmentation tasks. Additionally, the Salesforce BLIP image-captioning model is integrated to provide descriptive captions for the detected objects. Experimental results demonstrate that synthetic datasets generated by this approach not only cover complex and rare scenarios but also achieve comparable or superior performance to real-world datasets in object detection and segmentation tasks. These findings underscore the potential of synthetic data to serve as a cost-effective and scalable alternative to real-world data for autonomous vehicle training.
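The label-file and train/validation split steps mentioned above can be illustrated with a minimal sketch. It is not the authors' code: it assumes annotations are available as pixel-coordinate polygons and that labels follow the plain-text YOLO segmentation layout (a class index followed by normalized x y pairs). The function names `polygon_to_yolo_seg` and `split_dataset`, the 80/20 split, and the fixed seed are all hypothetical choices made for illustration.

```python
import random


def polygon_to_yolo_seg(class_id, polygon, img_w, img_h):
    """Format one polygon as a YOLO-style segmentation label line.

    polygon is a list of (x, y) pixel coordinates; each coordinate is
    normalized by the image width or height (assumed label layout).
    """
    coords = []
    for x, y in polygon:
        coords.append(f"{x / img_w:.6f}")
        coords.append(f"{y / img_h:.6f}")
    return f"{class_id} " + " ".join(coords)


def split_dataset(items, val_fraction=0.2, seed=0):
    """Shuffle deterministically and return (train, val) lists.

    The 0.2 validation fraction is an illustrative default, not a value
    taken from the paper.
    """
    items = sorted(items)  # fixed starting order so the seed is reproducible
    rng = random.Random(seed)
    rng.shuffle(items)
    n_val = int(round(len(items) * val_fraction))
    return items[n_val:], items[:n_val]


if __name__ == "__main__":
    # Hypothetical triangle annotation on a 640x480 image, class 0.
    print(polygon_to_yolo_seg(0, [(0, 0), (320, 0), (320, 240)], 640, 480))
    train, val = split_dataset([f"img_{i:03d}.png" for i in range(10)])
    print(len(train), len(val))
```

Writing one such line per object into a text file named after each image, and listing the two splits in a dataset configuration, is one common way to prepare data for a YOLO-style trainer; the paper itself does not specify these details.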