This paper investigates a privacy risk in domain-adapted SpeechLLMs: a model can be induced to transcribe a word from its prompt context or training data in place of a phonetically similar word that was actually spoken, leaking private information. The authors construct a controlled dataset to measure leakage rates under prompting and fine-tuning, and find that both mechanisms cause leakage, which compounds when they are combined. They also evaluate a prompt-level mitigation strategy and analyze the accuracy-leakage trade-off, finding that fine-tuning without context prompts offers the best balance.
Domain-adapted SpeechLLMs can be tricked into revealing sensitive information by transcribing phonetically similar words from their context or training data, even when a different word is spoken.
SpeechLLMs are increasingly deployed in professional settings where domain customisation is standard practice: users supply context in prompts with sensitive information, fine-tune on proprietary recordings, or both. We identify and systematically investigate an overlooked privacy risk of such customisation: a model adapted to recognise domain-specific terminology can be nudged into transcribing a phonetically similar word from its context or training data, even when a different word is spoken, thereby leaking private information. To evaluate this risk, we construct a controlled dataset and measure leakage rates across two customisation mechanisms, prompting and fine-tuning. Both mechanisms cause measurable leakage, compounding when combined. We evaluate a prompt-level mitigation strategy and analyse the accuracy-leakage trade-off across customisation approaches, finding that fine-tuning without context prompts offers the best balance. We release our code and dataset publicly.
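As an illustration of what a leakage rate could look like in practice, the sketch below counts a trial as a leak when the model's transcript contains the private word even though a different, phonetically similar word was spoken. This is an assumption made for illustration, not the paper's actual metric: the function names, the trial format, the single-token matching rule, and the example sentences are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def leakage_rate(trials):
    """Fraction of trials whose transcript contains the private word.

    Each trial is a (transcript, private_word) pair, where private_word is
    a single-token term that appears only in the prompt context or the
    fine-tuning data and was NOT spoken in the audio.
    Assumption: a leak is an exact token match; the paper may define it
    differently.
    """
    total = 0
    leaked = 0
    for transcript, private_word in trials:
        total += 1
        if private_word.lower() in tokenize(transcript):
            leaked += 1
    return leaked / total if total else 0.0


# Hypothetical trials: the speaker said "Klein" or "dose" in every case,
# while "Kline" and "doe" exist only in the context or training data.
trials = [
    ("please call mister klein", "kline"),        # correct transcript
    ("please call mister kline", "kline"),        # leak
    ("the patient was given a dose", "doe"),      # correct transcript
    ("the patient was given a doe", "doe"),       # leak
]

print(leakage_rate(trials))
```

Computing the same rate separately for the prompting-only, fine-tuning-only, and combined conditions, alongside ordinary transcription accuracy, would give the kind of accuracy-leakage comparison the abstract describes.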