Universitat Rovira i Virgili · Mar 8, 2026 · arXiv:2603.07567

Revisiting the LiRA Membership Inference Attack Under Realistic Assumptions

Najeeb Jebreel, Mona Khalil, David Sánchez, Josep Domingo-Ferrer

AI Summary

This paper re-evaluates the Likelihood-Ratio Attack (LiRA) for membership inference under more realistic assumptions about model training and attacker capabilities. The authors train models with anti-overfitting techniques and transfer learning, calibrate decision thresholds on shadow models rather than target data, evaluate under skewed membership priors, and measure how reproducible per-sample membership decisions are across runs. The results show that anti-overfitting and transfer learning significantly weaken LiRA, and that previous evaluations overstated its effectiveness because they relied on unrealistic assumptions.
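To make the shadow-based threshold calibration mentioned above concrete, here is a minimal sketch in Python. It is not the authors' implementation. The function names `calibrate_threshold` and `empirical_fpr`, the decision rule "predict member if score > threshold", and the use of a single pooled array of shadow non-member scores are all assumptions made for illustration.

```python
import numpy as np


def calibrate_threshold(shadow_nonmember_scores, target_fpr):
    """Pick a decision threshold from shadow non-member scores only.

    Hypothetical helper: the attacker never sees target-model labels.
    A sample is later predicted "member" when its score is strictly
    greater than the returned threshold.
    """
    scores = np.sort(np.asarray(shadow_nonmember_scores, dtype=float))
    n = len(scores)
    if n == 0:
        raise ValueError("need at least one shadow non-member score")
    if not 0.0 <= target_fpr <= 1.0:
        raise ValueError("target_fpr must lie in [0, 1]")
    # Number of shadow non-members allowed to exceed the threshold.
    k = int(np.floor(target_fpr * n))
    if k >= n:
        return float("-inf")  # every sample is predicted "member"
    # At most k sorted scores lie strictly above scores[n - k - 1].
    return float(scores[n - k - 1])


def empirical_fpr(nonmember_scores, threshold):
    """Fraction of non-member scores flagged as members."""
    scores = np.asarray(nonmember_scores, dtype=float)
    return float(np.mean(scores > threshold))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for attack scores; not data from the paper.
    shadow_out = rng.normal(loc=0.0, scale=1.0, size=10_000)
    target_out = rng.normal(loc=0.0, scale=1.0, size=10_000)
    tau = calibrate_threshold(shadow_out, target_fpr=0.001)
    print("threshold:", tau)
    print("FPR on shadow non-members:", empirical_fpr(shadow_out, tau))
    print("FPR on target non-members:", empirical_fpr(target_out, tau))
```

The FPR realized on the target model can differ from the value calibrated on shadow data. That gap is one reason the paper reports results under shadow-based thresholds rather than thresholds tuned on target data.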

Key Contribution

The state-of-the-art membership inference attack (LiRA) is far less threatening when models are trained with realistic anti-overfitting techniques and attackers cannot use target data to calibrate their thresholds.

Abstract

Membership inference attacks (MIAs) have become the standard tool for evaluating privacy leakage in machine learning (ML). Among them, the Likelihood-Ratio Attack (LiRA) is widely regarded as the state of the art when sufficient shadow models are available. However, prior evaluations have often overstated the effectiveness of LiRA by attacking models overconfident on their training samples, calibrating thresholds on target data, assuming balanced membership priors, and/or overlooking attack reproducibility. We re-evaluate LiRA under a realistic protocol that (i) trains models using anti-overfitting (AOF) and transfer learning (TL), when applicable, to reduce overconfidence as in production models; (ii) calibrates decision thresholds using shadow models and data rather than target data; (iii) measures positive predictive value (PPV, or precision) under shadow-based thresholds and skewed membership priors (π ≤ 10%); and (iv) quantifies per-sample membership reproducibility across different seeds and training variations. We find that AOF significantly weakens LiRA, and that TL further reduces attack effectiveness while improving model accuracy. Under shadow-based thresholds and skewed priors, LiRA's PPV often drops substantially, especially under AOF or AOF+TL. We also find that thresholded vulnerable sets at extremely low FPR show poor reproducibility across runs, while likelihood-ratio rankings are more stable. These results suggest that LiRA, and likely weaker MIAs, are less effective than previously suggested under realistic conditions, and that reliable privacy auditing requires evaluation protocols that reflect practical training practices, feasible attacker assumptions, and reproducibility considerations. Code is available at https://github.com/najeebjebreel/lira_analysis.
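The effect of a skewed membership prior on PPV follows from Bayes' rule: PPV = π·TPR / (π·TPR + (1 − π)·FPR), where π is the fraction of candidate samples that are true members. The short Python sketch below works through that arithmetic. The function name `ppv` and the TPR/FPR operating point are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def ppv(tpr, fpr, prior):
    """Positive predictive value (precision) of a membership attack.

    tpr   : true-positive rate at the chosen threshold
    fpr   : false-positive rate at the chosen threshold
    prior : membership prior pi, the fraction of candidates that are members
    """
    true_pos = prior * tpr            # mass of correctly flagged members
    false_pos = (1.0 - prior) * fpr   # mass of wrongly flagged non-members
    flagged = true_pos + false_pos
    if flagged == 0.0:
        return float("nan")           # nothing is flagged, so PPV is undefined
    return true_pos / flagged


if __name__ == "__main__":
    # Hypothetical operating point, not taken from the paper.
    tpr, fpr = 0.10, 0.001
    for prior in (0.5, 0.10, 0.01):
        print(f"prior={prior:.2f}  PPV={ppv(tpr, fpr, prior):.4f}")
```

At a fixed TPR and FPR, PPV falls as π shrinks, because the non-member population that can generate false positives grows relative to the member population. This is why an attack that looks precise under a balanced prior (π = 50%) can be much less precise when members are rare.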

Constitutional AI & AI Ethics · Eval Frameworks & Benchmarks · Red-Teaming & Adversarial Robustness

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
