This paper distills the privacy assessment capabilities of the 675B Mistral Large 3 model into smaller encoder models (as small as 150M parameters). The authors train these smaller models on a large dataset of privacy-annotated texts spanning 10 domains, significantly reducing computational costs while maintaining strong agreement with human privacy judgments. The distilled models are validated on human-annotated test data and shown to be useful for evaluating de-identification systems.
You can shrink a privacy-expert LLM by 4500x and still get human-level privacy judgments.
Accurate privacy evaluation of textual data remains a critical challenge in privacy-preserving natural language processing. Recent work has shown that large language models (LLMs) can serve as reliable privacy evaluators, achieving strong agreement with human judgments; however, their computational cost and the impracticality of processing sensitive data at scale limit real-world deployment. We address this gap by distilling the privacy assessment capabilities of Mistral Large 3 (675B) into lightweight encoder models with as few as 150M parameters. Leveraging a large-scale dataset of privacy-annotated texts spanning 10 diverse domains, we train efficient classifiers that preserve strong agreement with human annotations while dramatically reducing computational requirements. We validate our approach on human-annotated test data and demonstrate its practical utility as an evaluation metric for de-identification systems.
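For intuition, here is a minimal sketch of what such a distillation recipe could look like: a small pretrained encoder is fine-tuned as a binary privacy classifier on texts labeled by the large teacher model. Everything here is an assumption for illustration, not the paper's actual setup; `roberta-base` (~125M parameters) stands in for the unspecified 150M-parameter encoder, and the two example texts stand in for the teacher-annotated corpus.

```python
# Hypothetical sketch of distillation via teacher-labeled fine-tuning.
# Assumptions (not from the paper): student = roberta-base, binary labels
# (0 = no privacy risk, 1 = privacy risk) produced by the teacher LLM.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer


class PrivacyDataset(Dataset):
    """Pairs each text with the teacher LLM's privacy label."""

    def __init__(self, texts, labels, tokenizer, max_len=256):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item


# Placeholder teacher-annotated examples; in practice these would be the
# LLM's judgments over the 10-domain corpus.
texts = ["Patient John Doe, DOB 1980-01-01, was admitted on Tuesday.",
         "Spring weather in the region is generally mild."]
labels = [1, 0]

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=2)

loader = DataLoader(PrivacyDataset(texts, labels, tokenizer), batch_size=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One pass over the toy data: cross-entropy against the teacher's labels.
model.train()
for batch in loader:
    out = model(**batch)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

At inference time, the fine-tuned student replaces the teacher: a single forward pass over a candidate text yields a privacy judgment, which is what makes it cheap enough to score de-identification system outputs at scale.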