This paper introduces a benchmark suite for evaluating federated noisy label learning (FNLL) in medical image segmentation under real-world label noise such as contour disagreement, missing or additional structures, and confused labels. FNLL methods remain underused in practice because existing evidence relies largely on synthetic noise and simplified settings; by combining diverse real-world noisy datasets with deployment-relevant client-noise scenarios, the suite enables their systematic assessment. It provides a reusable foundation for fair benchmarking and informed method selection in realistic federated settings.
Real-world label noise can significantly hinder federated learning in medical image segmentation; this benchmark suite supports evaluating FNLL methods under realistic conditions.
While federated learning (FL) enables collaborative medical image segmentation without centralizing sensitive data, real-world deployment is frequently complicated by cross-site label imperfections such as contour disagreement, missing or additional structures, and confused labels. Federated noisy label learning (FNLL) aims to mitigate these effects, yet remains underused in practice because existing evidence is largely based on synthetic noise, simplified settings, and limited real-world noisy evaluation. We address this gap with a benchmark suite that combines curated real-world noisy medical image segmentation datasets from diverse sources with a comprehensive federated segmentation framework, including deployment-relevant client-noise scenarios and label-noise-targeted evaluation, to support systematic FNLL assessment and informed method selection. The suite provides a realistic and discriminative basis for FNLL evaluation in medical image segmentation and establishes a reusable foundation for fair benchmarking, dataset-specific label-noise characterization, and future method development under realistic federated settings. Code is available at https://github.com/MIC-DKFZ/FedSegNoiseBench.
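To make the idea of per-client label-noise characterization concrete, the following is a minimal sketch and is not taken from the FedSegNoiseBench code. It assumes each client holds binary masks as flattened 0/1 lists, that a cleaner reference annotation exists for the same images, and that noise is summarized as the mean of 1 - Dice between the two. The names `dice` and `client_noise_levels` and the site labels are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumption: a mask is a flattened binary segmentation (0 = background, 1 = foreground).
Mask = Sequence[int]


def dice(a: Mask, b: Mask) -> float:
    """Dice overlap between two binary masks of equal length."""
    if len(a) != len(b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def client_noise_levels(
    noisy: Dict[str, List[Mask]],
    reference: Dict[str, List[Mask]],
) -> Dict[str, float]:
    """Mean (1 - Dice) per client: 0.0 means the labels match the reference."""
    levels: Dict[str, float] = {}
    for client, masks in noisy.items():
        refs = reference[client]
        if not masks or len(masks) != len(refs):
            raise ValueError(f"client {client!r} needs matching, non-empty mask lists")
        scores = [1.0 - dice(m, r) for m, r in zip(masks, refs)]
        levels[client] = sum(scores) / len(scores)
    return levels


# Hypothetical two-site example: site_b's contour is shifted by one pixel.
reference = {"site_a": [[0, 1, 1, 0]], "site_b": [[0, 1, 1, 0]]}
noisy = {"site_a": [[0, 1, 1, 0]], "site_b": [[1, 1, 0, 0]]}
levels = client_noise_levels(noisy, reference)
print(levels)
```

In this toy case the score separates a client with clean labels from one with a contour shift. A real characterization would likely also need boundary-based and structure-level measures to capture missing or additional structures.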