This paper evaluates the integration of differential privacy (DP) and homomorphic encryption (HE) into federated learning (FL) for cardiovascular disease risk prediction using real-world Swedish healthcare data. The authors compare FL with DP and FL with HE against standard FL and centralized ML (cML), using logistic regression (LR) and neural network (NN) models to quantify privacy-utility trade-offs. Results show that FL with HE achieves performance comparable to cML but adds cryptographic overhead, particularly for the NN, while FL with DP incurs lower computational cost but greater performance degradation, particularly for LR.
Homomorphic encryption can make federated learning nearly as accurate as centralized training on sensitive healthcare data, but it adds measurable cryptographic overhead; differential privacy is computationally cheaper but costs more accuracy.
Protecting sensitive health data while enabling collaborative analysis is a central challenge in healthcare. Traditional machine learning (ML) requires institutions to pool anonymized patient records, centralizing analytical development and privacy risks at a single site. Privacy-enhancing technologies (PETs), including Differential Privacy (DP) and Homomorphic Encryption (HE), can mitigate these risks. However, they are mainly studied in conventional data-sharing settings and often introduce trade-offs, including reduced model utility, higher computational cost, and increased implementation complexity. Federated Learning (FL) reduces data centralization by enabling institutions to train models locally and share only model updates. Nevertheless, FL does not eliminate privacy risks, as shared parameters or gradients may still reveal sensitive information. Integrating DP or HE into FL can strengthen privacy guarantees, yet their comparative performance and deployment implications in real-world healthcare settings remain unclear. We systematically evaluated DP and HE integration in FL under real-world conditions, comparing them with standard FL and centralized ML (cML) to quantify privacy-utility trade-offs in multi-institutional settings. Using nationwide Swedish healthcare data, we evaluated cardiovascular disease risk prediction using logistic regression (LR) and neural network (NN) learners. FL with HE achieved performance comparable to cML but introduced measurable cryptographic overhead, particularly in the NN implementation. FL with DP incurred lower computational cost; however, LR was more sensitive to calibrated noise than the NN, resulting in greater performance degradation. Our findings provide practical guidance for deploying privacy-preserving FL in fragmented healthcare systems.
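To make the FL and DP mechanics described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes federated averaging of a logistic regression model over three synthetic "sites", with DP approximated by clipping each site's update and adding Gaussian noise to the averaged update. The names `local_update`, `clip`, `federated_round`, `max_norm`, and `noise_multiplier`, the synthetic data, and the choice to add noise at the aggregator are all illustrative assumptions; no privacy accounting is performed, and HE is not shown.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def local_update(w, data, lr=0.5, steps=5):
    """A few full-batch gradient steps of logistic regression on one site's data."""
    w = list(w)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sigmoid(dot(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(data)
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def clip(delta, max_norm):
    """Scale delta so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(d * d for d in delta))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [d * scale for d in delta]


def federated_round(w, sites, rng, max_norm=1.0, noise_multiplier=0.0):
    """One averaging round: raw records stay local, only clipped updates are shared.

    Assumption: Gaussian noise is added to the averaged update, with standard
    deviation noise_multiplier * max_norm / number_of_sites.
    """
    updates = []
    for data in sites:
        local_w = local_update(w, data)
        updates.append(clip([a - b for a, b in zip(local_w, w)], max_norm))
    sigma = noise_multiplier * max_norm / len(sites)
    return [
        wi + sum(u[j] for u in updates) / len(updates) + rng.gauss(0.0, sigma)
        for j, wi in enumerate(w)
    ]


def make_site(rng, n, true_w):
    """Synthetic records (hypothetical, not the Swedish data): x[0] is a bias term."""
    data = []
    for _ in range(n):
        x = [1.0] + [rng.gauss(0.0, 1.0) for _ in range(len(true_w) - 1)]
        y = 1.0 if rng.random() < sigmoid(dot(true_w, x)) else 0.0
        data.append((x, y))
    return data


def accuracy(w, data):
    hits = sum((sigmoid(dot(w, x)) >= 0.5) == (y == 1.0) for x, y in data)
    return hits / len(data)


rng = random.Random(0)
sites = [make_site(rng, 200, [0.0, 2.0, -2.0]) for _ in range(3)]
w_plain = [0.0, 0.0, 0.0]  # standard FL, no noise
w_dp = [0.0, 0.0, 0.0]     # FL with clipped, noised updates
for _ in range(30):
    w_plain = federated_round(w_plain, sites, rng)
    w_dp = federated_round(w_dp, sites, rng, noise_multiplier=1.0)

pooled = [example for site in sites for example in site]
print("standard FL accuracy:", accuracy(w_plain, pooled))
print("noised FL accuracy:  ", accuracy(w_dp, pooled))
```

Raising `noise_multiplier` or lowering `max_norm` in this sketch strengthens the perturbation and typically lowers accuracy, which is the kind of privacy-utility trade-off the paper quantifies on real data. A real deployment would also need a privacy accountant to translate the noise level into a formal DP guarantee.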