This paper introduces ProjRes, a novel passive membership inference attack (MIA) designed specifically for Federated Large Language Models (FedLLMs). ProjRes uses hidden embedding vectors as sample representations and analyzes their projection residuals on the gradient subspace to link gradients to inputs, overcoming the limitations that FedLLMs' unique properties impose on existing MIAs. Experiments show that ProjRes achieves near 100% accuracy across multiple benchmarks and LLMs, significantly outperforming prior methods and exposing a critical privacy vulnerability that persists even under strong differential privacy defenses.
FedLLMs, thought to be safer due to data localization, are shockingly vulnerable: a new attack achieves near 100% membership inference accuracy, even with differential privacy.
Federated Large Language Models (FedLLMs) enable multiple parties to collaboratively fine-tune LLMs without sharing raw data, addressing challenges of limited resources and privacy concerns. Despite data localization, shared gradients can still expose sensitive information through membership inference attacks (MIAs). However, FedLLMs' unique properties, i.e., massive parameter scales, rapid convergence, and sparse, non-orthogonal gradients, render existing MIAs ineffective. To address this gap, we propose ProjRes, the first projection residuals-based passive MIA tailored for FedLLMs. ProjRes leverages hidden embedding vectors as sample representations and analyzes their projection residuals on the gradient subspace to uncover the intrinsic link between gradients and inputs. It requires no shadow models, auxiliary classifiers, or historical updates, ensuring efficiency and robustness. Experiments on four benchmarks and four LLMs show that ProjRes achieves near 100% accuracy, outperforming prior methods by up to 75.75%, and remains effective even under strong differential privacy defenses. Our findings reveal a previously overlooked privacy vulnerability in FedLLMs and call for a re-examination of their security assumptions. Our code and data are available at https://anonymous.4open.science/r/Passive-MIA-5268.
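The core idea of a projection-residual test can be sketched in a few lines: if a sample was part of the training batch, its hidden embedding contributes to the observed gradients, so the embedding lies (approximately) in the subspace spanned by the shared gradient vectors and its projection residual is small. The sketch below is an illustrative assumption of how such a test could look; the function names, the relative-residual score, and the decision threshold are hypothetical and not taken from the paper's implementation.

```python
import numpy as np

def projection_residual(embedding, gradient_basis):
    """Relative residual of an embedding after projection onto the
    gradient subspace.

    embedding:      (d,) hidden embedding vector of a candidate sample.
    gradient_basis: (d, k) matrix whose columns span the observed
                    gradient subspace (e.g. gradient vectors collected
                    from one federated round).
    """
    # Orthonormalize the gradient vectors (thin QR decomposition).
    Q, _ = np.linalg.qr(gradient_basis)
    # Subtract the projection onto the subspace; what is left is the residual.
    residual = embedding - Q @ (Q.T @ embedding)
    # Normalize so the score is scale-invariant: 0 = fully inside
    # the subspace, 1 = orthogonal to it.
    return np.linalg.norm(residual) / np.linalg.norm(embedding)

def is_member(embedding, gradient_basis, threshold=0.5):
    # Hypothetical decision rule: members' embeddings lie close to the
    # gradient subspace, so their relative residual is small.
    return projection_residual(embedding, gradient_basis) < threshold
```

Note that this is a passive test in the sense the abstract describes: it only reads the gradients a client already shares and needs no shadow models, auxiliary classifiers, or historical updates.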