The paper introduces Generative Federated Prototype Learning (GFPL) to address challenges in federated learning caused by data imbalance and communication overhead. GFPL uses a Gaussian Mixture Model (GMM) to generate class-wise feature prototypes and aggregates them using Bhattacharyya distance to fuse knowledge across clients, while also generating pseudo-features to balance feature distributions. Experiments on benchmark datasets demonstrate that GFPL improves model accuracy by 3.6% under imbalanced data settings while maintaining low communication costs.
Achieve better accuracy in federated learning with imbalanced data and low communication costs by mimicking the brain's efficient knowledge integration.
Federated learning (FL) facilitates the secure utilization of decentralized images, advancing applications in medical image recognition and autonomous driving. However, conventional FL faces two critical challenges in real-world deployment: ineffective knowledge fusion caused by model updates biased toward majority-class features, and prohibitive communication overhead due to frequent transmissions of high-dimensional model parameters. Inspired by the human brain's efficiency in knowledge integration, we propose a novel Generative Federated Prototype Learning (GFPL) framework to address these issues. Within this framework, a prototype generation method based on a Gaussian Mixture Model (GMM) captures the statistical information of class-wise features, while a prototype aggregation strategy using Bhattacharyya distance effectively fuses semantically similar knowledge across clients. In addition, these fused prototypes are leveraged to generate pseudo-features, thereby mitigating feature distribution imbalance across clients. To further enhance feature alignment during local training, we devise a dual-classifier architecture, optimized via a hybrid loss combining Dot Regression and Cross-Entropy. Extensive experiments on benchmarks show that GFPL improves model accuracy by 3.6% under imbalanced data settings while maintaining low communication cost.
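The prototype pipeline described above (Gaussian prototypes per class, fusion by Bhattacharyya distance, pseudo-feature sampling) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each prototype is a single Gaussian component with a diagonal covariance, stored as a hypothetical `(mean, variance, sample_count)` tuple, and it swaps in a simple greedy rule: a prototype is merged into the first fused prototype whose distance falls below a hypothetical `threshold`, using a count-weighted moment-matched merge. The abstract does not specify the covariance structure, the merge rule, or any threshold. Only the closed-form Bhattacharyya distance between two Gaussians is standard.

```python
import math
import random


def bhattacharyya_diag(mu1, var1, mu2, var2):
    """Bhattacharyya distance between two Gaussians with diagonal covariances.

    Per dimension: (m1 - m2)^2 / (8 v) + 0.5 * ln(v / sqrt(v1 * v2)),
    where v = (v1 + v2) / 2. Diagonal covariance is an assumption here.
    """
    dist = 0.0
    for m1, v1, m2, v2 in zip(mu1, var1, mu2, var2):
        v = 0.5 * (v1 + v2)
        dist += 0.125 * (m1 - m2) ** 2 / v
        dist += 0.5 * math.log(v / math.sqrt(v1 * v2))
    return dist


def merge_gaussians(p, q):
    """Count-weighted, moment-matched merge of two (mean, var, count) prototypes."""
    (mu1, var1, n1), (mu2, var2, n2) = p, q
    w1, w2 = n1 / (n1 + n2), n2 / (n1 + n2)
    mu = [w1 * a + w2 * b for a, b in zip(mu1, mu2)]
    var = [
        w1 * (v1 + (a - m) ** 2) + w2 * (v2 + (b - m) ** 2)
        for a, b, v1, v2, m in zip(mu1, mu2, var1, var2, mu)
    ]
    return mu, var, n1 + n2


def fuse_prototypes(protos, threshold):
    """Greedy fusion (hypothetical rule): merge each prototype into the first
    fused prototype closer than `threshold`, otherwise keep it separate."""
    fused = []
    for proto in protos:
        for i, existing in enumerate(fused):
            if bhattacharyya_diag(proto[0], proto[1], existing[0], existing[1]) < threshold:
                fused[i] = merge_gaussians(existing, proto)
                break
        else:
            fused.append(proto)
    return fused


def sample_pseudo_features(mu, var, count, rng):
    """Draw `count` pseudo-feature vectors from a diagonal Gaussian prototype."""
    return [
        [rng.gauss(m, math.sqrt(v)) for m, v in zip(mu, var)]
        for _ in range(count)
    ]
```

In this sketch, a server would call `fuse_prototypes` on the same-class prototypes uploaded by all clients, and a client holding few samples of that class could call `sample_pseudo_features` on a fused prototype to pad its local feature set. Exchanging only means, variances, and counts is what keeps the payload small compared with full model parameters.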