This survey examines the emerging field of split learning for fine-tuning LLMs, addressing the challenges of resource constraints and data privacy that arise when adapting large models. It proposes a unified training pipeline to categorize existing research across model-level optimization, system-level efficiency, and privacy-preservation techniques. The survey identifies key challenges and opportunities for scalable, robust, and secure collaborative LLM adaptation, providing a structured overview of the current state of the art.
Split learning offers a viable path to fine-tuning LLMs on sensitive data without prohibitive compute costs or sacrificing privacy.
Fine-tuning unlocks large language models (LLMs) for specialized applications, but its high computational cost often puts it out of reach for resource-constrained organizations. While cloud platforms could provide the needed resources, data privacy concerns make sharing sensitive information with third parties risky. A promising solution is split learning for LLM fine-tuning, which divides the model between clients and a server, enabling collaborative and secure training through exchanged intermediate data and thus allowing resource-constrained participants to adapt LLMs safely. In light of this, a growing body of literature has emerged to advance this paradigm, introducing varied model-level methods, system optimizations, and privacy attack and defense techniques for split learning. To bring clarity and direction to the field, a comprehensive survey is needed to classify, compare, and critique these diverse approaches. This paper fills that gap by presenting the first extensive survey dedicated to split learning for LLM fine-tuning. We propose a unified, fine-grained training pipeline to pinpoint key operational components and conduct a systematic review of state-of-the-art work across three core dimensions: model-level optimization, system-level efficiency, and privacy preservation. Through this structured taxonomy, we establish a foundation for advancing scalable, robust, and secure collaborative LLM adaptation.
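To make the "exchanged intermediate data" mechanism concrete, below is a minimal, single-client PyTorch sketch of one split-learning training step. It is an illustration under simplifying assumptions, not the method of any specific paper: the tiny `client_model` and `server_model` modules, the cut point, and the toy dimensions are all hypothetical; in practice the cut-layer activations ("smashed data") and their gradients would cross a network boundary, and the approaches surveyed here layer PEFT adapters, compression, and privacy defenses on top of this basic exchange.

```python
import torch
import torch.nn as nn

# Hypothetical split of a language model: the client holds the bottom layers
# (here just an embedding plus one projection); the server holds the rest,
# including the output head. Sizes are toy values for illustration.
client_model = nn.Sequential(nn.Embedding(32000, 256), nn.Linear(256, 256))
server_model = nn.Sequential(nn.ReLU(), nn.Linear(256, 32000))

client_opt = torch.optim.AdamW(client_model.parameters(), lr=1e-4)
server_opt = torch.optim.AdamW(server_model.parameters(), lr=1e-4)

def training_step(token_ids, labels):
    # -- client side: forward pass up to the cut layer --
    smashed = client_model(token_ids)             # cut-layer activations
    sent = smashed.detach().requires_grad_(True)  # what crosses the "wire";
                                                  # raw tokens never leave

    # -- server side: finish the forward pass and compute the loss --
    logits = server_model(sent)
    loss = nn.functional.cross_entropy(
        logits.view(-1, logits.size(-1)), labels.view(-1))

    # -- server side: backprop down to the cut, update server weights --
    server_opt.zero_grad()
    loss.backward()                   # populates sent.grad, stops at the cut
    server_opt.step()

    # -- client side: resume backprop from the returned activation gradient --
    client_opt.zero_grad()
    smashed.backward(sent.grad)
    client_opt.step()
    return loss.item()

# Toy usage: random token ids stand in for a real fine-tuning batch.
ids = torch.randint(0, 32000, (4, 16))
print(training_step(ids, ids))
```

The key design point is that only cut-layer activations flow upstream and only their gradients flow back, so neither party ever holds the full model or the raw data; the privacy work the survey covers asks what those exchanged tensors can still leak and how to harden them.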