CMU ML | UT Austin | Apr 2, 2026 | arXiv:2604.02102

Prosodic ABX: A Language-Agnostic Method for Measuring Prosodic Contrast in Speech Representations

Haitong Sun, Stephen McIntosh, Kwanghee Choi, Eunjung Yeo, Daisuke Saito, Nobuaki Minematsu

AI Summary

The paper introduces prosodic ABX, an adaptation of the ABX discrimination task that quantifies how sensitive self-supervised speech model (S3M) representations are to prosodic contrasts (stress, pitch accent, and tone) using only a handful of examples and no explicit labels. The authors build and release a dataset of English and Japanese minimal pairs and use it together with a Mandarin dataset to evaluate English stress, Japanese pitch accent, and Mandarin tone. Model and layer rankings are often preserved across several experimental conditions, which makes the method practical for low-resource settings.

Key Contribution

A handful of minimal-pair examples, with no explicit labels, is enough to measure how well speech models capture prosodic contrasts such as stress, pitch accent, and tone.

Abstract

Speech representations from self-supervised speech models (S3Ms) are known to be sensitive to phonemic contrasts, but their sensitivity to prosodic contrasts has not been directly measured. The ABX discrimination task has been used to measure phonemic contrast in S3M representations via minimal pairs. We introduce prosodic ABX, an extension of this framework to evaluate prosodic contrast with only a handful of examples and no explicit labels. We also build and release a dataset of English and Japanese minimal pairs and use it along with a Mandarin dataset to evaluate contrast in English stress, Japanese pitch accent, and Mandarin tone. Finally, we show that model and layer rankings are often preserved across several experimental conditions, making the method practical for low-resource settings.
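As a rough illustration of the ABX idea the abstract refers to, the sketch below scores how often a token X lies closer to a token A of the same minimal-pair member than to a token B of the other member. It is only a minimal sketch under stated assumptions: each token is taken to be a single pooled vector, the distance is cosine distance, and ties count as half correct. None of these choices is given in the abstract, and the function names and toy vectors are hypothetical.

```python
import math
from itertools import permutations


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def abx_accuracy(category_a, category_b, distance=cosine_distance):
    """Fraction of (A, B, X) triples in which X is closer to A than to B.

    category_a and category_b are lists of pooled representation vectors,
    one list per member of a minimal pair (an assumption of this sketch).
    A and X are distinct tokens of one category; B comes from the other.
    Ties count as half correct (also an assumption).
    """
    correct = 0.0
    total = 0
    # Score both directions so each category serves once as the A/X side.
    for same, other in ((category_a, category_b), (category_b, category_a)):
        for a, x in permutations(same, 2):
            for b in other:
                d_ax = distance(a, x)
                d_bx = distance(b, x)
                if d_ax < d_bx:
                    correct += 1.0
                elif d_ax == d_bx:
                    correct += 0.5
                total += 1
    if total == 0:
        raise ValueError("need at least two tokens in one category")
    return correct / total


# Hypothetical toy vectors standing in for pooled S3M representations of
# two tokens per member of a prosodic minimal pair.
member_one = [[1.0, 0.1], [0.9, 0.2]]
member_two = [[0.1, 1.0], [0.2, 0.9]]
score = abx_accuracy(member_one, member_two)
```

A score near 1.0 would indicate that the representation separates the two members of the pair, while a score near 0.5 would indicate chance-level discrimination. Because only distances between tokens are compared, no classifier or label inventory has to be trained.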

Eval Frameworks & Benchmarks | Natural Language Processing | Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 41

Year: 2026

Venue: N/A
