KIT | May 27, 2026 | arXiv:2605.28211

When Helpful Context Leaks: Privacy Risks in Domain-Adapted ASR

AI Summary

This paper investigates privacy risks in domain-adapted SpeechLLMs, showing that models can be induced to transcribe phonetically similar words from context or training data, leaking private information. The authors construct a controlled dataset to measure leakage rates across prompting and fine-tuning, finding both mechanisms cause leakage, especially when combined. They evaluate a prompt-level mitigation strategy and analyze the accuracy-leakage trade-off, suggesting fine-tuning without context prompts offers the best balance.

Key Contribution

Domain-adapted SpeechLLMs can be tricked into revealing sensitive information by transcribing phonetically similar words from their context or training data, even when a different word is spoken.

Abstract

SpeechLLMs are increasingly deployed in professional settings where domain customisation is standard practice: users supply context in prompts with sensitive information, fine-tune on proprietary recordings, or both. We identify and systematically investigate an overlooked privacy risk of such customisation: a model adapted to recognise domain-specific terminology can be nudged into transcribing a phonetically similar word from its context or training data, even when a different word is spoken, thereby leaking private information. To evaluate this risk, we construct a controlled dataset and measure leakage rates across two customisation mechanisms, prompting and fine-tuning. Both mechanisms cause measurable leakage, compounding when combined. We evaluate a prompt-level mitigation strategy and analyse the accuracy-leakage trade-off across customisation approaches, finding that fine-tuning without context prompts offers the best balance. We release our code and dataset publicly.
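The leakage measurement described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' released code: the abstract does not define the metric, so the sketch assumes that a trial counts as a leak when the transcript contains the private word from the context or training data and omits the word that was actually spoken. The `Trial` type, the function names, and the example names "Hansen" and "Jansen" are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Trial:
    """One evaluation utterance (hypothetical structure, not from the paper)."""
    spoken_word: str   # word actually present in the audio
    private_word: str  # phonetically similar word from the prompt or training data
    transcript: str    # model output for this utterance


def _tokens(text: str) -> list[str]:
    # Lowercase and split on anything that is not a letter, digit, or apostrophe.
    return re.findall(r"[a-z0-9']+", text.lower())


def leaked(trial: Trial) -> bool:
    # Assumed definition: the private word appears and the spoken word does not.
    toks = _tokens(trial.transcript)
    return (trial.private_word.lower() in toks
            and trial.spoken_word.lower() not in toks)


def leakage_rate(trials: list[Trial]) -> float:
    # Fraction of trials that count as a leak; 0.0 for an empty list.
    if not trials:
        return 0.0
    return sum(leaked(t) for t in trials) / len(trials)


# Invented example data for illustration only.
example_trials = [
    Trial("Hansen", "Jansen", "Patient Jansen was admitted."),
    Trial("Hansen", "Jansen", "Patient Hansen was admitted."),
    Trial("Hansen", "Jansen", "Patient Hansen, not Jansen, was admitted."),
    Trial("Hansen", "Jansen", "Patient was admitted."),
]
rate = leakage_rate(example_trials)
```

Computing this rate separately for each configuration (prompt only, fine-tuning only, both) would allow the kind of comparison the abstract reports. Exact token matching is a simplification; a real evaluation might need to handle multi-word terms or spelling variants.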

Constitutional AI & AI Ethics | Red-Teaming & Adversarial Robustness | Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
