This paper evaluates the robustness of the Segment Anything Model (SAM) for spleen segmentation in abdominal CT scans under simulated domain shifts. The authors applied controlled perturbations such as noise, blur, and contrast changes to CT images and measured the impact on segmentation accuracy using Dice score and failure rate. Results show that SAM maintains stable segmentation performance, with a minimal change in Dice score (|ΔDice| < 0.01) and no significant increase in failure probability across the tested perturbations, suggesting robustness to common CT imaging variations.
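The evaluation loop described above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the assumption that slices are scaled to [0, 1], the perturbation strengths, and the failure threshold of Dice < 0.5 are all hypothetical choices made for illustration, since the summary does not specify them.

```python
import numpy as np

def dice_score(pred, truth):
    """Dice = 2 * |intersection| / (|pred| + |truth|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)

def bbox_from_mask(mask):
    """Tight box (x_min, y_min, x_max, y_max) around a nonempty binary mask."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

def add_gaussian_noise(img, sigma, rng):
    """Additive zero-mean Gaussian noise; image assumed scaled to [0, 1]."""
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def scale_contrast(img, factor):
    """Stretch or compress intensities about the image mean."""
    mean = img.mean()
    return np.clip((img - mean) * factor + mean, 0.0, 1.0)

def gamma_correct(img, gamma):
    """Power-law intensity remapping on a [0, 1] image."""
    return np.clip(img, 0.0, 1.0) ** gamma

def failure_rate(dice_values, threshold=0.5):
    """Fraction of slices whose Dice falls below a (hypothetical) threshold."""
    return float(np.mean(np.asarray(dice_values) < threshold))

# Toy example: a 4x4 ground-truth square and a prediction shifted by one pixel.
truth = np.zeros((8, 8), dtype=bool)
truth[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[2:6, 3:7] = True

print(dice_score(pred, truth))   # overlap is 12 px out of 16 + 16
print(bbox_from_mask(truth))     # box that would be passed as the prompt
```

In a real audit, each perturbed slice would be passed to the segmentation model together with the box derived from the ground-truth mask, and the per-slice Dice values under each condition would be compared with the clean baseline.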
Despite concerns about domain shift in medical imaging, SAM (ViT-B) demonstrates surprisingly robust spleen segmentation in abdominal CT scans even under simulated inter-scanner variations.
Foundation segmentation models such as the Segment Anything Model (SAM) have demonstrated strong generalization across natural images; however, their robustness under clinically realistic medical imaging domain shifts remains insufficiently quantified. We present a systematic slice-level robustness audit of SAM (ViT-B) for spleen segmentation in abdominal CT using 1,051 nonempty slices from 41 volumes in the Medical Segmentation Decathlon. A standardized ground-truth-derived bounding-box protocol was used to isolate encoder robustness from prompt uncertainty. Controlled perturbations simulating inter-scanner variability, including Gaussian noise, blur, contrast scaling, gamma correction, and resolution mismatch, were applied across ten conditions. The clean baseline achieved a mean Dice score of 0.9145 (95% CI: [0.909, 0.919]) with a failure rate of 0.67%. Across all perturbations, the absolute mean ΔDice remained below 0.01. Paired Wilcoxon signed-rank tests with Benjamini-Hochberg false discovery rate correction identified statistically significant but small-magnitude changes under selected conditions, while McNemar analysis showed no significant increase in failure probability. These findings indicate that SAM exhibits stable segmentation behavior under moderate CT domain shifts, supporting its role as a robust foundation baseline for medical image segmentation research. As health digital twins increasingly incorporate foundation segmentation models for anatomical modeling and organ-level monitoring, formal characterization of robustness under real-world imaging variability is a necessary step toward trustworthy deployment.
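The two corrections and tests named in the abstract can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' analysis code. It assumes the per-condition p-values have already been obtained from paired Wilcoxon signed-rank tests (for example with `scipy.stats.wilcoxon`), and it uses the exact binomial form of McNemar's test; the abstract does not say which variant was used. The function names and the example numbers are hypothetical.

```python
from math import comb

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    For the p-value of rank r (ascending) among m tests, the adjusted value
    is the minimum of p * m / r over that rank and all larger ranks.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the running minimum enforces
    # monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: slices that fail only under the perturbation
    c: slices that fail only on the clean image
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a change
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# Hypothetical p-values for four perturbation conditions.
raw_p = [0.01, 0.04, 0.03, 0.5]
print(benjamini_hochberg(raw_p))

# Hypothetical discordant counts: 3 new failures, 3 recovered failures.
print(mcnemar_exact(3, 3))
```

A condition would be flagged as significant when its adjusted p-value falls below the chosen false discovery rate. With a very small baseline failure rate, the discordant counts are small, which is why the exact test is used here rather than the chi-square approximation.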