This paper introduces MIXGUARD, a mixup-based framework for privacy-preserving split learning in large language models that targets the trade-off among utility, privacy, efficiency, and stability. It combines token-level obfuscation, representation-level obfuscation, and adaptive gradient perturbation to preserve model utility while defending against data reconstruction attacks. In experiments on four classification tasks and four text generation tasks across multiple LLM families, MIXGUARD achieves utility comparable to non-split training baselines and stronger privacy protection than existing split learning defenses.
MIXGUARD achieves robust privacy protection in split learning without sacrificing model utility, outperforming existing defenses against advanced data reconstruction attacks.
Split learning provides a practical paradigm for resource-constrained users to train Large Language Models (LLMs) by offloading computation-intensive layers to a server while keeping raw data local. However, existing privacy-preserving split learning methods still face a difficult trade-off among utility, privacy, efficiency, and stability. Specifically, these methods often suffer from substantial utility degradation, remain vulnerable to advanced data reconstruction attacks, incur prohibitive computational and communication overhead, or exhibit unstable performance across different tasks. In this paper, we propose MIXGUARD, a novel mixup-based privacy-preserving split learning framework for LLMs. MIXGUARD introduces token-level obfuscation, representation-level obfuscation, and adaptive gradient perturbation mechanisms, which operate jointly to preserve useful learning signals while preventing privacy leakage to the server. Technically, MIXGUARD first constructs a lightweight calibration model on a public dataset to refine the approximated target representation, and then applies this model during privacy-preserving fine-tuning on private data. We conduct extensive experiments on four classification tasks and four text generation tasks across multiple LLM families, model sizes, architectures, and fine-tuning strategies. The results show that MIXGUARD preserves model utility comparable to non-split training baselines, consistently achieves stronger privacy protection than existing split learning defense methods against state-of-the-art data reconstruction attacks, and remains robust under adaptive attack settings.
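The abstract does not spell out how MIXGUARD's obfuscation is computed, so the following is only a minimal sketch of generic mixup applied to a batch of client-side hidden vectors before they would be sent to a server. It is not the paper's mechanism. The function name `mixup_representations`, the `alpha` parameter, and the choice of a Beta(alpha, alpha) mixing coefficient are assumptions made for illustration, following the standard mixup formulation.

```python
import random


def mixup_representations(hidden, alpha=0.4, rng=None):
    """Generic mixup over a batch of hidden vectors.

    Illustrative only: a hypothetical helper, not MIXGUARD's actual rule.
    Each vector h_i is replaced by lam * h_i + (1 - lam) * h_j, where j is
    the partner assigned to i by a random permutation of the batch.
    """
    rng = rng or random.Random()
    n = len(hidden)

    # Assign each example a mixing partner via a random permutation.
    perm = list(range(n))
    rng.shuffle(perm)

    # One mixing coefficient per batch, drawn from Beta(alpha, alpha).
    lam = rng.betavariate(alpha, alpha)

    mixed = [
        [lam * a + (1.0 - lam) * b for a, b in zip(hidden[i], hidden[perm[i]])]
        for i in range(n)
    ]
    return mixed, perm, lam


if __name__ == "__main__":
    batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    mixed, perm, lam = mixup_representations(batch, rng=random.Random(0))
    print("lambda:", lam)
    print("partners:", perm)
    print("mixed:", mixed)
```

In this sketch the server would only see convex combinations of examples rather than any single example's representation. How the real method picks partners, sets the coefficient, and recovers a usable training signal is described in the paper itself, not here.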