Hornetsecurity | Feb 24, 2026 | arXiv:2602.20743

Adaptive Text Anonymization: Learning Privacy-Utility Trade-offs via Prompt Optimization

Gabriel Loiseau, Damien Sileo, Damien Riquet, Maxime Meyer, Marc Tommasi

AI Summary

This paper introduces adaptive text anonymization, a task formulation in which anonymization strategies are automatically adapted to specific privacy-utility requirements. The authors propose a framework for task-specific prompt optimization that constructs anonymization instructions for language models, enabling adaptation to different privacy goals, domains, and downstream usage patterns. Experiments across five datasets show that the framework achieves a better privacy-utility trade-off than existing baselines while remaining computationally efficient and effective on open-source language models.

Key Contribution

Instead of relying on hand-crafted rules, this work learns prompts that steer language models toward a better privacy-utility trade-off in text anonymization, adapting to diverse privacy needs and outperforming static baselines.

Abstract

Anonymizing textual documents is a highly context-sensitive problem: the appropriate balance between privacy protection and utility preservation varies with the data domain, privacy objectives, and downstream application. However, existing anonymization methods rely on static, manually designed strategies that lack the flexibility to adjust to diverse requirements and often fail to generalize across domains. We introduce adaptive text anonymization, a new task formulation in which anonymization strategies are automatically adapted to specific privacy-utility requirements. We propose a framework for task-specific prompt optimization that automatically constructs anonymization instructions for language models, enabling adaptation to different privacy goals, domains, and downstream usage patterns. To evaluate our approach, we present a benchmark spanning five datasets with diverse domains, privacy constraints, and utility objectives. Across all evaluated settings, our framework consistently achieves a better privacy-utility trade-off than existing baselines, while remaining computationally efficient and effective on open-source language models, with performance comparable to larger closed-source models. Additionally, we show that our method can discover novel anonymization strategies that explore different points along the privacy-utility trade-off frontier.
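To make the idea of choosing anonymization instructions against a privacy-utility objective concrete, here is a minimal sketch. It is not the authors' framework: it only picks the best prompt from a fixed candidate list using a weighted sum of two scores, whereas the paper describes an optimization procedure whose details are not given in the abstract. All names (`select_prompt`, `tradeoff`, `toy_anonymize`, `toy_privacy`, `toy_utility`, the `weight` parameter, and the `SENSITIVE` word list) are hypothetical, and the toy functions stand in for a real language model and real privacy and utility evaluators.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Scored:
    prompt: str
    privacy: float  # mean privacy score in [0, 1], higher is better
    utility: float  # mean utility score in [0, 1], higher is better


def tradeoff(privacy: float, utility: float, weight: float) -> float:
    """Assumed scalarization: weighted sum of privacy and utility."""
    return weight * privacy + (1.0 - weight) * utility


def select_prompt(
    candidates: Sequence[str],
    anonymize: Callable[[str, str], str],
    privacy_fn: Callable[[str, str], float],
    utility_fn: Callable[[str, str], float],
    docs: Sequence[str],
    weight: float = 0.5,
) -> Scored:
    """Score every candidate prompt on docs and return the best one."""
    scored = []
    for prompt in candidates:
        outputs = [anonymize(prompt, doc) for doc in docs]
        privacy = sum(privacy_fn(d, o) for d, o in zip(docs, outputs)) / len(docs)
        utility = sum(utility_fn(d, o) for d, o in zip(docs, outputs)) / len(docs)
        scored.append(Scored(prompt, privacy, utility))
    return max(scored, key=lambda s: tradeoff(s.privacy, s.utility, weight))


# Toy stand-ins (hypothetical): a real system would call a language model
# and task-specific privacy and utility evaluators instead.
SENSITIVE = {"Alice", "Paris"}


def toy_anonymize(prompt: str, doc: str) -> str:
    words = doc.split()
    if prompt == "mask_names":
        return " ".join("[X]" if w in SENSITIVE else w for w in words)
    if prompt == "mask_all":
        return " ".join("[X]" for _ in words)
    return doc  # any other prompt leaves the text unchanged


def toy_privacy(doc: str, out: str) -> float:
    """Fraction of sensitive tokens that no longer appear in the output."""
    sensitive = [w for w in doc.split() if w in SENSITIVE]
    if not sensitive:
        return 1.0
    leaked = sum(1 for w in out.split() if w in SENSITIVE)
    return 1.0 - leaked / len(sensitive)


def toy_utility(doc: str, out: str) -> float:
    """Fraction of non-sensitive tokens preserved in the output."""
    keep = [w for w in doc.split() if w not in SENSITIVE]
    if not keep:
        return 1.0
    kept = sum(1 for w in out.split() if w in keep)
    return kept / len(keep)


best = select_prompt(
    ["keep", "mask_names", "mask_all"],
    toy_anonymize,
    toy_privacy,
    toy_utility,
    ["Alice visited Paris yesterday"],
)
print(best)
```

Changing `weight` moves the selection along the trade-off: values near 1 favor prompts that remove more sensitive content, while values near 0 favor prompts that preserve more of the original text.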

Constitutional AI & AI Ethics | Data Curation & Synthetic Data | Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 32

Year: 2026

Venue: N/A
