This paper introduces DiffuSAM, a diffusion-based adaptation of SAM2 for prompt-free medical image segmentation. DiffuSAM uses a diffusion prior to synthesize segmentation mask-like embeddings from frozen SAM2 image features and feeds them into SAM2's mask decoder. Experiments on the BTCV and CHAOS datasets show that DiffuSAM achieves competitive performance in Source-Free Unsupervised Domain Adaptation (SF-UDA) and Few-Shot settings.
Ditch the prompts: DiffuSAM adapts SAM2 to medical image segmentation by synthesizing mask embeddings with a lightweight diffusion prior over frozen SAM2 image features, reaching competitive performance without user prompts.
Segmentation models such as Segment Anything Model (SAM) and SAM2 achieve strong prompt-driven zero-shot performance. However, their training on natural images limits domain transfer to medical data. Consequently, accurate segmentation typically requires extensive fine-tuning and expert-designed prompts. We propose DiffuSAM, a diffusion-based adaptation of SAM2 for prompt-free medical image segmentation. Our framework synthesizes SAM2-compatible segmentation mask-like embeddings via a lightweight diffusion prior from off-the-shelf frozen SAM2 image features. The generated embeddings are integrated into SAM2's mask decoder to produce accurate segmentations, thereby eliminating the need for user prompts. The diffusion prior is further conditioned on previously segmented slices, enforcing spatial consistency across volumes. Evaluated on the BTCV and CHAOS datasets for CT and MRI under Source-Free Unsupervised Domain Adaptation (SF-UDA) and Few-Shot settings, DiffuSAM achieves competitive performance with efficient training and inference. Code is available upon request from the corresponding author.
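The data flow described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not specify the sampler, the embedding shape, or how previous slices are injected, so everything below is an assumption. `encode`, `denoise`, and `decode` are hypothetical stand-ins for the frozen SAM2 image encoder, the diffusion prior's denoiser, and SAM2's mask decoder. The sampler is a simplified deterministic refinement loop rather than any particular diffusion scheduler, and the previous slice's embedding is assumed to be the conditioning signal.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]
# Hypothetical denoiser signature (an assumption, not from the paper):
# (noisy embedding, step, image features, previous-slice condition) -> clean estimate
Denoiser = Callable[[Vector, int, Vector, Vector], Vector]


def sample_embedding(denoise: Denoiser, image_feats: Vector, prev_cond: Vector,
                     steps: int = 10, seed: int = 0) -> Vector:
    """Simplified reverse process: start from noise, move toward the denoiser's estimate."""
    rng = random.Random(seed)
    # Assumption: the embedding has the same length as the feature vector.
    x = [rng.gauss(0.0, 1.0) for _ in image_feats]
    for t in range(steps, 0, -1):
        x0_hat = denoise(x, t, image_feats, prev_cond)
        w = 1.0 / t  # at t == 1 the sample equals the final estimate
        x = [(1.0 - w) * xi + w * x0i for xi, x0i in zip(x, x0_hat)]
    return x


def segment_volume(slices: Sequence[Vector],
                   encode: Callable[[Vector], Vector],
                   denoise: Denoiser,
                   decode: Callable[[Vector, Vector], List[int]],
                   steps: int = 10) -> List[List[int]]:
    """Prompt-free, slice-by-slice segmentation with previous-slice conditioning."""
    masks: List[List[int]] = []
    prev_emb: Vector = []
    for i, s in enumerate(slices):
        feats = encode(s)  # frozen features, never updated
        cond = prev_emb if prev_emb else [0.0] * len(feats)  # first slice: no history
        emb = sample_embedding(denoise, feats, cond, steps, seed=i)
        masks.append(decode(feats, emb))  # the embedding replaces a user prompt
        prev_emb = emb
    return masks


# Toy stand-ins so the sketch runs end to end (purely illustrative).
def toy_encode(s: Vector) -> Vector:
    return list(s)


def toy_denoise(x: Vector, t: int, feats: Vector, cond: Vector) -> Vector:
    # Ignores x and t; blends current features with the previous-slice condition.
    return [0.5 * (f + c) for f, c in zip(feats, cond)]


def toy_decode(feats: Vector, emb: Vector) -> List[int]:
    return [1 if e > 0.2 else 0 for e in emb]


print(segment_volume([[1.0, 0.0], [0.0, 0.0]], toy_encode, toy_denoise, toy_decode))
```

In this toy run the second slice has no signal of its own, yet its first element is still labeled foreground because the embedding from the first slice is carried forward. This is the kind of cross-slice consistency the conditioning is meant to encourage, shown here only in caricature.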