National Institute for Data Science in Health and Medicine | School of Informatics | Tencent AI | Westlake | Mar 5, 2026 | arXiv:2603.04887

Federated modality-specific encoders and partially personalized fusion decoder for multimodal brain tumor segmentation

Hong Liu, Dong Wei, Qian Dai, Qi Dai, Xian Wu, Yefeng Zheng, Liansheng Wang

AI Summary

This paper introduces FedMEPD, a federated learning framework designed to handle both intermodal and intramodal heterogeneity in multimodal medical image analysis, specifically brain tumor segmentation. The framework employs modality-specific encoders trained via federated learning and partially personalized decoders to cater to individual participant data characteristics. A server with full-modal data uses a fusion decoder to bridge modalities and optimize encoders, while clients with incomplete modalities calibrate missing-modal representations using cross-attention with global full-modal anchors. Experiments on BraTS 2018 and 2020 datasets demonstrate FedMEPD's superior performance compared to existing multimodal and personalized FL methods.
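
The calibration step mentioned above can be illustrated with a minimal NumPy sketch of scaled dot-product cross-attention, in which client features act as queries and the global anchors act as keys and values. This is an illustration under stated assumptions, not the authors' implementation: the function name `calibrate_with_anchors`, the flat `(N, d)` feature layout, the absence of learned projection matrices, and the residual addition are all hypothetical choices made for brevity.

```python
import numpy as np

def calibrate_with_anchors(features, anchors):
    """Pull client features toward global anchors via cross-attention.

    features: (N, d) array of missing-modal client representations (queries).
    anchors:  (K, d) array of full-modal anchors from the server (keys/values).
    Returns the calibrated features (N, d) and attention weights (N, K).
    """
    d = features.shape[-1]
    # Scaled dot-product similarity between every feature and every anchor.
    scores = features @ anchors.T / np.sqrt(d)        # (N, K)
    # Softmax over anchors, shifted by the row maximum for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each feature receives a weighted mixture of anchors.
    attended = weights @ anchors                      # (N, d)
    # Assumption: combine with the original features through a residual sum.
    return features + attended, weights

rng = np.random.default_rng(0)
client_features = rng.normal(size=(5, 8))   # 5 feature vectors, 8 channels
global_anchors = rng.normal(size=(3, 8))    # 3 anchors, 8 channels
calibrated, attn = calibrate_with_anchors(client_features, global_anchors)
```

Each row of `attn` sums to one, so every client feature is adjusted by a convex combination of the anchors.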

Key Contribution

A new federated learning framework tackles the challenge of training a global brain tumor segmentation model when different hospitals have access to different MRI modalities.

Abstract

Most existing federated learning (FL) methods for medical image analysis only considered intramodal heterogeneity, limiting their applicability to multimodal imaging applications. In practice, some FL participants may possess only a subset of the complete imaging modalities, posing intermodal heterogeneity as a challenge to effectively training a global model on all participants' data. Meanwhile, each participant expects a personalized model tailored to its local data characteristics in FL. This work proposes a new FL framework with federated modality-specific encoders and partially personalized multimodal fusion decoders (FedMEPD) to address the two concurrent issues. Specifically, FedMEPD employs an exclusive encoder for each modality to account for the intermodal heterogeneity. While these encoders are fully federated, the decoders are partially personalized to meet individual needs, using the discrepancy between global and local parameter updates to dynamically determine which decoder filters are personalized. Implementation-wise, a server with full-modal data employs a fusion decoder to fuse representations from all modality-specific encoders, thus bridging the modalities to optimize the encoders via backpropagation. Moreover, multiple anchors are extracted from the fused multimodal representations and distributed to the clients in addition to the model parameters. Conversely, the clients with incomplete modalities calibrate their missing-modal representations toward the global full-modal anchors via scaled dot-product cross-attention, making up for the information loss due to absent modalities. FedMEPD is validated on the BraTS 2018 and 2020 multimodal brain tumor segmentation benchmarks. Results show that it outperforms various up-to-date methods for multimodal and personalized FL, and its novel designs are effective.
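
The partial personalization idea in the abstract (choosing which decoder filters stay local based on the discrepancy between global and local parameter updates) can be sketched as follows. The abstract does not specify the discrepancy measure or the selection rule, so this sketch assumes a per-filter cosine similarity between the two updates and a fixed threshold; the function name `personalize_filters` and the `threshold` parameter are hypothetical.

```python
import numpy as np

def personalize_filters(local_w, global_w, prev_w, threshold=0.0):
    """Decide per filter whether to keep local or adopt global weights.

    local_w, global_w, prev_w: arrays of shape (F, ...) holding the locally
    trained, server-aggregated, and previous-round decoder weights for F filters.
    Returns the merged weights and a boolean mask of personalized filters.
    """
    num_filters = local_w.shape[0]
    # Per-filter updates relative to the previous round, flattened to vectors.
    local_upd = (local_w - prev_w).reshape(num_filters, -1)
    global_upd = (global_w - prev_w).reshape(num_filters, -1)
    # Assumption: cosine similarity measures agreement between the updates.
    dot = (local_upd * global_upd).sum(axis=1)
    norms = np.linalg.norm(local_upd, axis=1) * np.linalg.norm(global_upd, axis=1)
    cosine = dot / (norms + 1e-12)
    # Assumption: filters whose updates disagree with the global ones stay local.
    personalized = cosine < threshold
    # Reshape the mask so it broadcasts over the remaining filter dimensions.
    mask = personalized.reshape((num_filters,) + (1,) * (local_w.ndim - 1))
    merged = np.where(mask, local_w, global_w)
    return merged, personalized

prev = np.zeros((2, 3))
local = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
glob = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
merged, personalized = personalize_filters(local, glob, prev)
```

In this toy example the first filter's local and global updates agree, so it takes the global weights, while the second filter's updates point in opposite directions, so it keeps its local weights.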

Computer Vision | Distributed Systems & Hardware | Multimodal Models

Citation Metrics

Citations: 4

Influential citations: 0

References: 63

Year: 2025

Venue: Medical Image Anal.
