JHU | Apr 29, 2026 | arXiv:2604.26281

DiffAnon: Diffusion-based Prosody Control for Voice Anonymization

Ismail Rasim Ulgen, Zexin Cai, Nicholas Andrews, Philipp Koehn, Berrak Sisman

AI Summary

This paper introduces DiffAnon, a diffusion-based voice anonymization method leveraging classifier-free guidance to offer continuous control over prosody preservation during inference. By refining acoustic details over semantic embeddings from an RVQ codec, DiffAnon enables smooth interpolation between anonymization strength and prosodic fidelity. Experiments demonstrate a structured trade-off between utility and privacy, achieving competitive privacy while preserving prosodic information at controllable operating points.

Key Contribution

DiffAnon gives voice anonymization a smooth, tunable knob for balancing privacy against prosody preservation, rather than forcing a choice between the two.

Abstract

To preserve or not to preserve prosody is a central question in voice anonymization. Prosody conveys meaning and affect, yet is tightly coupled with speaker identity. Existing methods either discard prosody for privacy or lack a principled mechanism to control the utility-privacy trade-off, operating at fixed design points. We propose DiffAnon, a diffusion-based anonymization method with classifier-free guidance (CFG) that provides explicit, continuous inference-time control over prosody preservation. DiffAnon refines acoustic detail over semantic embeddings of an RVQ codec, enabling smooth interpolation between anonymization strength and prosodic fidelity within a single model. To the best of our knowledge, it is the first voice anonymization framework to provide structured, interpolatable inference-time prosody control. Experiments demonstrate structured trade-off behavior, achieving strong utility while maintaining competitive privacy across controllable operating points.
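The abstract does not spell out DiffAnon's guidance rule, so the following is only a minimal sketch of the standard classifier-free guidance blend, to illustrate how a single scalar can interpolate between two denoiser predictions at inference time. The function name `cfg_combine`, the toy vectors, and the reading of the two predictions as "without prosody conditioning" and "with prosody conditioning" are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def cfg_combine(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Standard classifier-free guidance blend of two denoiser outputs.

    Computes eps_uncond + w * (eps_cond - eps_uncond) element-wise.
    w = 0 returns the unconditional prediction, w = 1 the conditional one,
    and values in between interpolate linearly.
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Hypothetical toy predictions standing in for a real denoiser's outputs.
eps_without_prosody = [0.0, 1.0, 2.0]  # assumed: prosody condition dropped
eps_with_prosody = [1.0, 1.0, 0.0]     # assumed: prosody condition supplied

# Sweep the guidance scale to move between the two operating points.
for w in (0.0, 0.5, 1.0):
    print(w, cfg_combine(eps_without_prosody, eps_with_prosody, w))
```

Under these assumptions, sweeping the scale at inference time moves the output between the two predictions without retraining, which is the kind of single-model, continuous control the abstract describes. How DiffAnon maps the scale to anonymization strength is not stated in the text above.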

Natural Language Processing, Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
