This paper introduces FedMChain, a framework for Multimodal Federated Learning (MMFL) that addresses modality competition by structuring training as a chain of modality-wise phases. Each modality receives a dedicated local optimization window on multimodal clients, and an error-compensated regularizer promotes cross-modal complementarity. On the server side, a sparse sign-guided aggregation strategy supports less frequent synchronization. Experiments on multimodal benchmarks show that FedMChain improves predictive performance while communicating less frequently than baselines.
Structuring federated multimodal training into dedicated modality-wise phases mitigates modality competition; combined with sparse sign-guided server aggregation, it improves predictive performance while requiring less frequent communication.
Multimodal Federated Learning (MMFL) enables privacy-preserving collaborative learning across decentralized clients with heterogeneous data and modality availability. However, most existing MMFL methods cast multimodal training as a joint optimization problem, overlooking a key bottleneck: modality competition, where dominant modalities suppress weaker ones and lead to suboptimal global models. To address this, we propose FedMChain, a balanced MMFL framework that structures federated multimodal training as a chain of modality-wise phases. This phase-wise design gives each modality a dedicated local optimization window on multimodal clients to mitigate modality competition, and further promotes cross-modal complementarity via an error-compensated regularizer. On the server side, we employ a sparse sign-guided aggregation strategy that leverages directional sign agreement for robust intra-modality aggregation, avoids destructive averaging, and supports less frequent synchronization to reduce communication overhead. Extensive experiments on multimodal benchmarks demonstrate that FedMChain consistently improves predictive performance while requiring less frequent communication than baselines.
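The abstract does not spell out the aggregation rule, so the following is only a minimal sketch of one way directional sign agreement could drive a sparse aggregation step for a single modality. The function name `sign_guided_aggregate`, the `min_agreement` threshold, and the choice to average only the values that share the majority sign are assumptions made for illustration, not details taken from FedMChain.

```python
from typing import List, Sequence


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_guided_aggregate(
    updates: Sequence[Sequence[float]],
    min_agreement: float = 0.6,  # hypothetical threshold, not from the paper
) -> List[float]:
    """Aggregate client updates for one modality, coordinate by coordinate.

    A coordinate is kept only if at least `min_agreement` of the clients
    share the majority sign; it is then set to the mean of those agreeing
    values. All other coordinates stay at zero, so the result is sparse.
    """
    if not updates:
        raise ValueError("need at least one client update")
    dim = len(updates[0])
    if any(len(u) != dim for u in updates):
        raise ValueError("all client updates must have the same length")

    aggregated = [0.0] * dim
    for j in range(dim):
        column = [u[j] for u in updates]
        # Majority direction: the sign of the summed per-client signs.
        vote = sign(sum(sign(v) for v in column))
        if vote == 0:
            continue  # tie or all zeros: no agreed direction
        agreeing = [v for v in column if sign(v) == vote]
        if len(agreeing) / len(column) >= min_agreement:
            aggregated[j] = sum(agreeing) / len(agreeing)
    return aggregated


# Three hypothetical clients, three parameters each.
client_updates = [
    [0.5, -0.25, 0.5],
    [0.25, 0.5, -0.5],
    [0.75, -0.75, 0.0],
]
print(sign_guided_aggregate(client_updates))  # [0.5, -0.5, 0.0]
```

In this toy example, plain averaging would shrink the second coordinate toward zero because one client disagrees on direction, whereas the sketch averages only the two agreeing clients. The third coordinate has no majority direction and is dropped, which is one way the "sparse" and "avoids destructive averaging" properties described above might be realized.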