The paper addresses the scalability challenge in Personalized Federated Learning (PFL) by reformulating it as a few-for-many optimization problem in which $K$ shared server models serve $M$ clients ($K \ll M$). The authors prove that this framework achieves near-optimal personalization: the approximation error decreases as $K$ increases, and each client's model converges to its optimum as data grows. Building on this, they propose FedFew, an algorithm that jointly optimizes the $K$ server models through gradient-based updates, and report that it outperforms existing PFL methods on vision, NLP, and medical imaging datasets.
Ditch the complexity of one-model-per-client in federated learning: FedFew achieves state-of-the-art personalization with just a handful of shared models, outperforming existing methods without manual clustering or hyperparameter tuning.
Personalized Federated Learning (PFL) aims to train customized models for clients with highly heterogeneous data distributions while preserving data privacy. Existing approaches often rely on heuristics like clustering or model interpolation, which lack principled mechanisms for balancing heterogeneous client objectives. Serving $M$ clients with distinct data distributions is inherently a multi-objective optimization problem, where achieving optimal personalization ideally requires $M$ distinct models on the Pareto front. However, maintaining $M$ separate models poses significant scalability challenges in federated settings with hundreds or thousands of clients. To address this challenge, we reformulate PFL as a few-for-many optimization problem that maintains only $K$ shared server models ($K \ll M$) to collectively serve all $M$ clients. We prove that this framework achieves near-optimal personalization: the approximation error diminishes as $K$ increases, and each client's model converges to that client's optimum as data grows. Building on this reformulation, we propose FedFew, a practical algorithm that jointly optimizes the $K$ server models through efficient gradient-based updates. Unlike clustering-based approaches that require manual client partitioning or interpolation-based methods that demand careful hyperparameter tuning, FedFew automatically discovers the optimal model diversity through its optimization process. Experiments across vision, NLP, and real-world medical imaging datasets demonstrate that FedFew, with just 3 models, consistently outperforms other state-of-the-art approaches. Code is available at https://github.com/pgg3/FedFew.
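To make the few-for-many idea concrete, the following is a minimal toy sketch, not the FedFew algorithm itself. It assumes one plausible reading of the formulation: each client $i$ is served by whichever of the $K$ models gives it the lowest loss, and the $K$ models are trained jointly by gradient descent on a softmin-smoothed average of those per-client best losses. The quadratic client losses, the function names (`client_losses`, `few_for_many_objective`, `soft_assignment`, `train`), and the `temperature` parameter are all illustrative assumptions and do not come from the paper or its repository.

```python
import numpy as np

def client_losses(models, optima):
    """Toy loss of every client under every model.

    models: (K, d) array of shared model parameters.
    optima: (M, d) array; row i is the (assumed) optimum of client i.
    Returns an (M, K) array of losses 0.5 * ||optimum_i - model_k||^2.
    """
    diff = optima[:, None, :] - models[None, :, :]
    return 0.5 * np.sum(diff ** 2, axis=-1)

def few_for_many_objective(models, optima):
    """Average over clients of the loss under each client's best model."""
    return client_losses(models, optima).min(axis=1).mean()

def soft_assignment(losses, temperature):
    """Softmax over models of -loss / temperature (numerically stabilized)."""
    z = -losses / temperature
    z -= z.max(axis=1, keepdims=True)
    w = np.exp(z)
    return w / w.sum(axis=1, keepdims=True)

def train(optima, K, steps=200, lr=0.5, temperature=0.1, seed=0):
    """Jointly update K models by gradient descent on the softmin-smoothed objective."""
    rng = np.random.default_rng(seed)
    M, d = optima.shape
    # Assumption: initialize from K distinct clients' optima plus small noise.
    start = rng.choice(M, size=K, replace=False)
    models = optima[start] + 0.01 * rng.standard_normal((K, d))
    for _ in range(steps):
        w = soft_assignment(client_losses(models, optima), temperature)  # (M, K)
        # Gradient for model k: mean over clients of w[i, k] * (model_k - optimum_i).
        grad = (w.sum(axis=0)[:, None] * models - w.T @ optima) / M
        models -= lr * grad
    return models

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    centers = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
    # 30 synthetic clients whose optima fall into three loose groups.
    optima = np.repeat(centers, 10, axis=0) + 0.1 * rng.standard_normal((30, 2))
    for K in (1, 2, 3):
        models = train(optima, K)
        print(K, few_for_many_objective(models, optima))
```

In this toy setting no client-to-model partition is specified by hand; the soft assignment weights are recomputed from the current losses at every step, which loosely mirrors the abstract's claim that model diversity emerges from the optimization rather than from manual clustering. How FedFew actually smooths, weights, or federates these updates is not stated in the abstract and should be taken from the linked repository.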