This paper investigates how differentially private (DP) text rewriting affects linguistic style. It finds that DP rewriting, whether autoregressive or bidirectional, systematically alters a text's communicative signature, causing a loss of interactive markers, contextual references, and complex subordination. While the DP mechanisms preserve semantic content, they homogenize stylistic markers and push text toward a non-involved, non-persuasive register.
Differential privacy doesn't just change the words you use; it fundamentally reshapes your writing style, stripping away the nuances that make it human.
Differential Privacy (DP) for text has matured from disjointed word-level substitutions to contiguous sentence-level rewriting by leveraging the generative capacity of language models. While this form of text privatization is best suited for balancing formal privacy guarantees with grammatical coherence, its impact on the register identity of text remains largely unexplored. By conducting a multidimensional stylistic profiling of differentially private rewriting, we demonstrate that the cost of privacy extends far beyond lexical variation. Specifically, we find that rewriting under privacy constraints induces a systematic functional mutation of the text's communicative signature. This shift is characterized by the severe attrition of interactive markers, contextual references, and complex subordination. By comparing autoregressive paraphrasing against bidirectional substitution across a spectrum of privacy budgets, we observe that both architectures force convergence toward a non-involved and non-persuasive register. This register-blind sanitization effectively preserves semantic content but structurally homogenizes the nuanced stylistic markers that define human-authored discourse.