This paper introduces a multi-architecture framework for automated polyp segmentation in colonoscopy images, addressing the challenges of limited datasets and annotation complexities. The framework leverages Stable Diffusion for synthetic data generation, Faster R-CNN for polyp detection, and the Segment Anything Model (SAM) for mask refinement. Experiments comparing U-Net, PSPNet, FPN, LinkNet, and MANet (with ResNet34 backbone) for segmentation revealed that FPN achieved the highest PSNR and SSIM, while LinkNet showed balanced IoU and Dice scores, demonstrating the effectiveness of the integrated approach.
Synthetic data generation via Stable Diffusion can overcome data limitations in medical image segmentation, achieving comparable accuracy to real datasets in polyp detection.
Colonoscopy is a vital tool for the early detection of colorectal cancer, one of the leading causes of cancer-related mortality worldwide. This research introduces a multi-architecture framework to automate polyp segmentation in colonoscopy images while addressing the limited size of medical datasets and the complexity of annotation. The system combines synthetic data generation via Stable Diffusion with detection and segmentation algorithms: Faster R-CNN provides initial polyp localization, and the Segment Anything Model (SAM) refines the resulting segmentation masks. The Faster R-CNN detector achieved a recall of 93.08%, a precision of 88.97%, and an F1 score of 90.98%; SAM is then used to generate the image mask. Five state-of-the-art segmentation models (U-Net, PSPNet, FPN, LinkNet, and MANet), each with a ResNet34 backbone, were evaluated. FPN achieved the highest PSNR (7.205893) and SSIM (0.492381), U-Net excelled in recall (84.85%), and LinkNet showed balanced performance in IoU (64.20%) and Dice score (77.53%). The framework's primary contribution lies in its synthetic data pipeline and automatic ground-truth generation, which mitigate data limitations without sacrificing medical accuracy. By uniting multiple architectures with an extensive set of evaluation metrics, the framework sets new benchmarks that should accelerate the development of medical image segmentation tools across specialties.
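To make the reported IoU and Dice numbers concrete, the sketch below shows how these two overlap metrics are typically computed for binary segmentation masks. This is an illustrative implementation, not code from the paper; the function names and toy masks are my own. It also illustrates the identity Dice = 2·IoU / (1 + IoU), which explains why Dice scores (e.g. LinkNet's 77.53%) always sit above the corresponding IoU (64.20%).

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection over Union between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice coefficient; for binary masks this equals 2*IoU / (1 + IoU)."""
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2 * inter / total if total else 1.0

# Toy 4x4 masks: the predicted region half-overlaps the ground truth.
pred = np.zeros((4, 4), dtype=bool); pred[1:3, 1:3] = True  # 4 pixels
gt   = np.zeros((4, 4), dtype=bool); gt[1:3, 2:4] = True    # 4 pixels
print(iou(pred, gt))   # 2 overlapping / 6 in union = 0.333...
print(dice(pred, gt))  # 2*2 overlapping / 8 total  = 0.5
```

Note that 2 * 0.333 / (1 + 0.333) = 0.5, matching the Dice value directly, so the two metrics rank models consistently but on different scales.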