This paper presents a comprehensive, system-level characterization of MPC and FHE for privacy-preserving machine learning (PPML) inference across CNN and Transformer models. It evaluates two MPC variants (arithmetic/binary sharing conversion and function secret sharing) and FHE under various LAN/WAN environments, model sizes, batch sizes, and sequence lengths, considering both online and offline overheads. The study provides empirical guidance on selecting, optimizing, and deploying these PPML paradigms, highlighting how hardware and network trends impact the trade-offs between MPC and FHE.
Choosing the right privacy-preserving ML technique involves more than just latency: this benchmark reveals the hidden energy and monetary costs that can make MPC cheaper than FHE in some surprising scenarios.
Privacy protection has become an increasing concern in modern machine learning applications. Privacy-preserving machine learning (PPML) has attracted growing research attention, with approaches such as secure multiparty computation (MPC) and fully homomorphic encryption (FHE) being actively explored. However, existing evaluations of these approaches have frequently relied on narrow, fragmented setups and focused on a single performance metric, such as the online inference latency at a specific batch size. From the existing reports, it is hard to compare approaches, especially with respect to other metrics such as energy and monetary cost or broader system settings (varying hyperparameters, offline overheads, future hardware/network configurations, etc.). We present a unified characterization of three popular approaches -- two variants of MPC, based on arithmetic/binary sharing conversion and on function secret sharing, and FHE -- measuring their performance and cost when performing privacy-preserving inference on multiple CNN and Transformer models. We study a range of LAN and WAN environments, model sizes, batch sizes, and input sequence lengths. We evaluate not only performance but also the energy consumption and monetary cost of deployment under a realistic scenario, accounting for both offline and online computation/communication overheads. We provide empirical guidance for selecting, optimizing, and deploying these privacy-preserving compute paradigms, and outline how evolving hardware and network trends are likely to shift the trade-offs between the two MPC schemes and FHE. This work provides system-level insights for researchers and practitioners who seek to understand or accelerate PPML workloads.
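To make the offline/online split mentioned in the abstract concrete, the following is a minimal, illustrative sketch (not the paper's implementation) of two-party additive secret sharing over a prime field, with a Beaver-triple multiplication. The triple generation stands in for the offline phase; the masked-value openings form the online phase. All function names are hypothetical, and real MPC frameworks additionally use fixed-point encodings and interactive triple-generation protocols.

```python
import random

P = 2**61 - 1  # prime modulus for this toy example

def share(x):
    """Split x into two additive shares that sum to x mod P."""
    r = random.randrange(P)
    return r, (x - r) % P

def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % P

def beaver_triple():
    """Offline phase: a trusted dealer samples a, b and shares (a, b, a*b)."""
    a, b = random.randrange(P), random.randrange(P)
    return share(a), share(b), share(a * b % P)

def mul(x_sh, y_sh, triple):
    """Online phase: multiply shared x and y, consuming one Beaver triple."""
    (a0, a1), (b0, b1), (c0, c1) = triple
    # The parties open the masked values e = x - a and f = y - b,
    # which reveal nothing about x or y since a and b are uniform.
    e = reconstruct((x_sh[0] - a0) % P, (x_sh[1] - a1) % P)
    f = reconstruct((y_sh[0] - b0) % P, (y_sh[1] - b1) % P)
    # z = c + e*b + f*a + e*f = x*y; the e*f term is added by one party only.
    z0 = (c0 + e * b0 + f * a0 + e * f) % P
    z1 = (c1 + e * b1 + f * a1) % P
    return z0, z1

# Example: securely multiply 6 and 7.
product = reconstruct(*mul(share(6), share(7), beaver_triple()))  # 42
```

The key cost asymmetry the paper's offline/online accounting captures is visible even here: the triples can be produced in bulk ahead of time, while each online multiplication costs one round of communication to open `e` and `f`.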