Jun 1, 2026 | arXiv:2606.01856

Boosting Multimodal Federated Learning via Chained Modality Optimization

Zixin Zhang, Fan Qi, Shuai Li, Xiaoshan Yang, Changsheng Xu

AI Summary

This paper introduces FedMChain, a novel framework for Multimodal Federated Learning (MMFL) that addresses the issue of modality competition by structuring training into modality-wise phases. By providing each modality with a dedicated optimization window and employing an error-compensated regularizer, FedMChain enhances cross-modal complementarity and improves overall model performance. Experimental results show that FedMChain not only boosts predictive accuracy but also reduces communication frequency compared to existing methods.

Key Contribution

Structuring federated multimodal training as a chain of dedicated modality-wise phases mitigates modality competition, improving predictive performance while requiring less frequent communication than baselines.

Abstract

Multimodal Federated Learning (MMFL) enables privacy-preserving collaborative learning across decentralized clients with heterogeneous data and modality availability. However, most existing MMFL methods cast multimodal training as a joint optimization problem, overlooking a key bottleneck: modality competition, where dominant modalities suppress weaker ones and lead to suboptimal global models. To address this, we propose FedMChain, a balanced MMFL framework that structures federated multimodal training as a chain of modality-wise phases. This phase-wise design gives each modality a dedicated local optimization window on multimodal clients to mitigate modality competition, and further promotes cross-modal complementarity via an error-compensated regularizer. On the server side, we employ a sparse sign-guided aggregation strategy that leverages directional sign agreement for robust intra-modality aggregation, avoids destructive averaging, and supports less frequent synchronization to reduce communication overhead. Extensive experiments on multimodal benchmarks demonstrate that FedMChain consistently improves predictive performance while requiring less frequent communication than baselines.
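The abstract does not specify how the sparse sign-guided aggregation is computed, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes each client sends a flat update vector for one modality, that sparsity means keeping the top-k entries by magnitude, and that "sign agreement" means a per-coordinate majority vote over the surviving entries. The function names `sparsify_top_k` and `sign_guided_aggregate` and the parameter `k` are hypothetical.

```python
from typing import List


def sign(x: float) -> int:
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sparsify_top_k(update: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries of an update and zero the rest.

    Assumption: top-k magnitude sparsification stands in for the paper's
    (unspecified) sparsity mechanism.
    """
    if k >= len(update):
        return list(update)
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(update)]


def sign_guided_aggregate(updates: List[List[float]], k: int) -> List[float]:
    """Aggregate client updates for one modality, coordinate by coordinate.

    Assumption: the dominant direction of a coordinate is the majority sign
    among the sparsified client entries. Only entries agreeing with that
    sign are averaged; a tied vote yields 0.0 for that coordinate.
    """
    sparse = [sparsify_top_k(u, k) for u in updates]
    merged = []
    for j in range(len(sparse[0])):
        column = [u[j] for u in sparse]
        vote = sign(sum(sign(v) for v in column))
        agreeing = [v for v in column if v != 0.0 and sign(v) == vote]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged


# Three hypothetical clients, three parameters, keep the top 2 entries each.
client_updates = [
    [0.9, -0.1, 0.5],
    [1.1, 0.2, -0.5],
    [1.0, 0.4, 0.05],
]
aggregated = sign_guided_aggregate(client_updates, k=2)
```

In this toy input the clients agree on the first coordinate, so it is averaged as usual, while the third coordinate has conflicting signs after sparsification and is dropped instead of being averaged toward a small, noisy value. That is one plausible reading of "avoids destructive averaging"; the paper's actual rule may differ.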

Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 48

Year: 2026

Venue: N/A
