Jun 8, 2026, arXiv:2606.09145

PrivCode++: Latent-Conditioned Differentially Private Code Generation for Comprehensive Guarantees

Zheng Liu, Chen Gong, Terry Yue Zhuo, Zhou Yang, Kecen Li, Wenlong Meng, Xinwen Hou, Yu Liu, Xiaochen Li

AI Summary

This paper introduces PrivCode++, a novel two-stage differential privacy framework for code generation that treats both prompts and code snippets as sensitive data during fine-tuning. By employing a Privacy-Free Latent Conditioning module, the method enables effective code synthesis without direct access to sensitive information, addressing the limitations of existing approaches that only protect code snippets. Experimental results demonstrate that PrivCode++ significantly enhances utility while maintaining robust privacy guarantees, outperforming baseline methods and remaining competitive even when privacy assumptions are relaxed.

Key Contribution

Code generation can be both private and high-utility, with PrivCode++ achieving superior results by safeguarding both prompts and snippets.

Abstract

Large language models fine-tuned on instruction-code pairs may memorize and subsequently leak sensitive training data. Existing differentially private (DP) code generation methods primarily protect code snippets while assuming prompts are public, an assumption that fails in realistic scenarios where prompts may also contain sensitive information. When prompts cannot be explicitly learned or used during generation, code synthesis suffers from severe utility degradation as well as reduced diversity and fidelity. To address these challenges, we propose PrivCode-Plus, the first work to explore DP code generation where both prompts and code snippets are considered sensitive in LLM fine-tuning. PrivCode-Plus introduces a two-stage DP framework with a Privacy-Free Latent Conditioning module, enabling effective DP fine-tuning and data synthesis without direct access to sensitive prompts or code. Extensive experiments show that PrivCode-Plus achieves substantially higher utility than baselines, remains competitive with the method that operates under relaxed privacy assumptions, and provides stronger privacy guarantees.
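The abstract does not say which mechanism implements DP fine-tuning. As background only, the sketch below shows one step of generic DP-SGD (per-example gradient clipping followed by Gaussian noise), a common way to train under differential privacy. It is not the PrivCode-Plus algorithm, and the function name `dp_sgd_step` and its parameters are hypothetical names chosen for illustration.

```python
import math
import random


def dp_sgd_step(per_example_grads, clip_norm, noise_multiplier, rng):
    """Return a privatized average gradient.

    Generic DP-SGD illustration (an assumption, not the paper's method):
    clip each example's gradient to L2 norm <= clip_norm, sum, add
    Gaussian noise with std noise_multiplier * clip_norm, then average.
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(x * x for x in grad))
        # Shrink only gradients whose norm exceeds the clipping bound.
        scale = min(1.0, clip_norm / max(norm, 1e-12))
        for i, x in enumerate(grad):
            total[i] += x * scale
    sigma = noise_multiplier * clip_norm
    n = len(per_example_grads)
    # Noise is added to the clipped sum, whose per-example sensitivity is clip_norm.
    return [(t + rng.gauss(0.0, sigma)) / n for t in total]


# Toy per-example gradients: the first has norm 5.0 and is clipped, the second is not.
toy_grads = [[3.0, 4.0], [0.3, 0.4]]
noisy_update = dp_sgd_step(toy_grads, clip_norm=1.0, noise_multiplier=1.0,
                           rng=random.Random(0))
```

Clipping bounds how much any single training example can move the update, and the noise scale is tied to that bound. The resulting (ε, δ) guarantee depends on the noise multiplier, sampling rate, and number of steps, which a privacy accountant would track; none of these values are given in the abstract.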

Code Generation & Program Synthesis, Constitutional AI & AI Ethics

Citation Metrics

Citations: 0
Influential citations: 0
References: 0
Year: 2026
Venue: N/A
