Auckland | Mar 19, 2026 | arXiv:2603.18678

Words at Play: Benchmarking Audio Pun Understanding in Large Audio-Language Models

Yuchen Su, Shaoxin Zhong, Yonghua Zhu, Ruofan Wang, Zijian Huang, Qiqi Wang, Na Zhao, Diana Benavides-Prado, Michael Witbrock

AI Summary

The paper introduces APUN-Bench, a new benchmark for evaluating Large Audio-Language Models (LALMs) on their ability to understand audio puns, covering recognition, localization, and meaning inference. Evaluating 10 state-of-the-art LALMs on the 4,434-sample benchmark reveals significant performance gaps across all three stages of pun understanding. The analysis highlights challenges such as positional biases in pun localization and errors in meaning inference, providing insights for future research.

Key Contribution

LALMs still struggle to get the joke, with a new benchmark showing they can't reliably recognize, locate, or understand audio puns.

Abstract

Puns represent a typical linguistic phenomenon that exploits polysemy and phonetic ambiguity to generate humour, posing unique challenges for natural language understanding. Alongside text and images, audio plays a central role in human communication, yet datasets and systematic resources for spoken puns remain scarce, leaving this crucial modality largely underexplored in pun research. In this paper, we present APUN-Bench, the first benchmark dedicated to evaluating large audio language models (LALMs) on audio pun understanding. Our benchmark contains 4,434 audio samples annotated across three stages: pun recognition, pun word location, and pun meaning inference. We conduct a deep analysis of APUN-Bench by systematically evaluating 10 state-of-the-art LALMs, uncovering substantial performance gaps in recognizing, localizing, and interpreting audio puns. This analysis reveals key challenges, such as positional biases in audio pun location and error cases in meaning inference, offering actionable insights for advancing humour-aware audio intelligence.

Eval Frameworks & Benchmarks | Multimodal Models | Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 49

Year: 2026

Venue: N/A
