This paper analyzes the reliability of efficient membership inference attack (MIA) evaluation pipelines, commonly used to assess data leakage in machine learning models. It identifies two key weaknesses: a lack of calibration in TPR estimation when MIA scores are concatenated across individuals, and a finite population bias in the efficient likelihood-ratio attack (LiRA) implementation. To address the calibration issue, the authors propose a post-processing method to effectively calibrate the FPR across different samples, improving the reliability of MIA evaluations.
Averaging membership inference scores across multiple individuals to reduce compute can lead to unreliable vulnerability assessments due to uncalibrated false positive rates.
Membership inference attacks (MIAs) are popular methods for empirically assessing how much sensitive information about the training data leaks through models or statistics learned from that data. MIA vulnerability is often evaluated through the false positive rate (FPR) and true positive rate (TPR) of a binary classifier that tries to predict whether a particular sample was in the training data. However, reliably estimating the TPR, especially at low FPR values, requires many observations, which for MIAs translates to many target models and hence a large computational cost. To avoid excessive compute requirements, the MIA scores are often averaged over multiple individuals and multiple target models. We demonstrate two key weaknesses in this efficient MIA evaluation pipeline. First, we show that evaluating the TPR based on MIA scores concatenated across multiple individuals, a practice commonly used to study vulnerabilities in the very low FPR regime, is not calibrated across the per-sample FPRs. This makes it unreliable as a tool for auditing differential privacy. To solve this, we propose a post-processing method to effectively calibrate the FPR across different samples. Second, we identify a finite population bias in the commonly used efficient likelihood-ratio attack (LiRA) implementation proposed by Carlini et al. (2022), leading to a positive bias in the per-sample vulnerability.
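The first weakness can be illustrated with a small synthetic sketch. It is not the authors' pipeline or their proposed calibration method: the Gaussian score model, the per-individual score scales, the sample sizes, and the helper `tpr_at_fpr` are all assumptions made for illustration. The sketch pools non-member scores across individuals, picks one global threshold at a target FPR, and then checks what FPR that threshold implies for each individual separately.

```python
import numpy as np


def tpr_at_fpr(scores_in, scores_out, target_fpr):
    """TPR of the rule `score > threshold`, where the threshold is the
    (1 - target_fpr) quantile of the non-member (out) scores."""
    threshold = np.quantile(scores_out, 1.0 - target_fpr)
    return float(np.mean(scores_in > threshold)), float(threshold)


rng = np.random.default_rng(0)
n_individuals, n_models = 50, 2000  # hypothetical sizes

# Assumption: each individual's scores are Gaussian with its own scale,
# so some individuals have much noisier scores than others.
scales = rng.uniform(0.5, 2.0, size=n_individuals)
scores_out = rng.normal(0.0, scales[:, None], size=(n_individuals, n_models))
scores_in = rng.normal(1.0, scales[:, None], size=(n_individuals, n_models))

target_fpr = 0.01

# Pooled evaluation: concatenate scores across individuals, one threshold.
pooled_tpr, threshold = tpr_at_fpr(scores_in.ravel(), scores_out.ravel(), target_fpr)
pooled_fpr = float(np.mean(scores_out > threshold))

# Per-individual FPR implied by that same global threshold.
per_sample_fpr = np.mean(scores_out > threshold, axis=1)

print(f"pooled FPR: {pooled_fpr:.4f}, pooled TPR: {pooled_tpr:.4f}")
print(f"per-sample FPR range: {per_sample_fpr.min():.4f} to {per_sample_fpr.max():.4f}")
```

In this toy setup the pooled FPR matches the target by construction, while the per-individual FPRs are spread around it: individuals with noisy scores exceed the global threshold far more often than those with concentrated scores. That spread is the sense in which a pooled TPR-at-FPR estimate is not calibrated across samples.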