The paper introduces Audio Hallucination Attacks (AHA), a suite of query-based and audio-based attacks designed to probe the reliability of Large Audio Language Models (LALMs) by inducing hallucinations. Evaluating state-of-the-art LALMs such as Audio Flamingo 3 and Gemini 3 Pro on AHA-Eval, the authors find attack success rates as high as 95%, indicating a significant reliability gap. To address this, they propose AHA-Guard, a post-alignment dataset that reduces attack success rates by up to 49%.
State-of-the-art Large Audio Language Models are surprisingly vulnerable to hallucination attacks, with success rates as high as 95%, revealing a critical reliability gap masked by standard benchmarks.
Large Audio Language Models (LALMs) achieve strong performance on audio-language tasks; however, their reliability in real-world settings remains underexplored. We introduce Audio Hallucination Attacks (AHA) and an accompanying evaluation suite, AHA-Eval, comprising 6.5K QA pairs designed to test whether LALMs genuinely ground their responses in the audio input. AHA targets two attack surfaces: (i) query-based attacks, which exploit question structure to induce hallucinations about absent sounds, and (ii) audio-based attacks, which inject synthetic speech describing non-existent events into the audio stream. Evaluating state-of-the-art LALMs, including Audio Flamingo 3 and Gemini 3 Pro, we observe high attack success rates of 95.35% and 79.65%, respectively, revealing a reliability gap that is hidden by standard benchmark performance. To mitigate this, we propose AHA-Guard, a 120K QA post-alignment dataset that reduces attack success rates by up to 49%.
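To make the audio-based attack surface concrete, the sketch below mixes a synthetic-speech claim about a non-existent event into an otherwise benign recording, which would then be paired with a question about that event. The function name, offset, mixing gain, and the stand-in waveforms are illustrative assumptions, not the paper's exact pipeline.

```python
# Minimal sketch of an audio-based hallucination attack, assuming the adversary
# overlays TTS audio asserting a fabricated event onto a clean clip.
import numpy as np


def inject_spoken_claim(clean: np.ndarray, spoken_claim: np.ndarray,
                        sr: int, offset_s: float = 1.0, gain: float = 0.5) -> np.ndarray:
    """Mix a synthetic speech waveform (e.g. TTS of "a dog barks loudly")
    into the clean recording at a chosen offset."""
    out = clean.copy()
    start = int(offset_s * sr)
    end = min(start + len(spoken_claim), len(out))
    out[start:end] += gain * spoken_claim[: end - start]
    # Keep samples in [-1, 1] so the adversarial clip remains a valid waveform.
    return np.clip(out, -1.0, 1.0)


if __name__ == "__main__":
    sr = 16_000
    clean = 0.1 * np.random.randn(10 * sr)                            # stand-in for a real recording
    spoken = 0.3 * np.sin(2 * np.pi * 220 * np.arange(2 * sr) / sr)   # stand-in for TTS speech
    adversarial = inject_spoken_claim(clean, spoken, sr)
    # The LALM would then be asked about the fabricated event, e.g.
    # "What was the dog doing in this recording?"
```

A query-based attack, by contrast, leaves the audio untouched and instead phrases the question so that it presupposes a sound that never occurs in the clip.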