NTT · Jun 16, 2026 · arXiv:2606.17537

Non-Autoregressive Minimum Bayes' Risk Decoding for Fast Speech Recognition

Hiroyuki Deguchi, Takatomo Kano, Katsuki Chousa, Marc Delcroix

AI Summary

This paper introduces NAR-MBR, a non-autoregressive (NAR) decoding framework that applies minimum Bayes' risk (MBR) decoding to improve speech recognition accuracy while retaining the speed advantage of NAR methods. It selects the output that maximizes expected utility over multiple samples drawn from the NAR model's output probability, which mitigates the uncertainty that NAR decoding cannot resolve by conditioning on previously generated tokens. In experiments on LibriSpeech, Switchboard, AMI, and a web presentation corpus, NAR-MBR outperformed previous NAR decoding and ran faster than autoregressive decoding.

Key Contribution

NAR-MBR decoding outperforms previous NAR decoding in recognition accuracy while running faster than autoregressive decoding.

Abstract

Non-autoregressive (NAR) decoding generates output tokens in parallel, making speech recognition faster than autoregressive (AR) decoding, which generates them sequentially from left to right. However, recognition performance is degraded because NAR decoding cannot resolve uncertainty by conditioning on previously generated tokens. To address this issue, we propose a novel NAR decoding framework based on minimum Bayes' risk (MBR) decoding, termed NAR-MBR decoding, that maximizes the expected utility calculated from samples drawn from the output probability of an NAR model rather than maximizing the output probability. Notably, by leveraging the nature of NAR models, multiple samples are obtained efficiently with a single forward computation. Our experiments across LibriSpeech, Switchboard, AMI, and a web presentation corpus demonstrated that our NAR-MBR decoding outperformed previous NAR decoding and ran faster than AR decoding.
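The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the per-position independent sampling, the negative edit distance used as the utility, and the function names `sample_hypotheses` and `mbr_select` are all assumptions made for illustration. The abstract does not specify the model architecture, the sampling scheme, or the utility function.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # deletion
                            curr[j - 1] + 1,           # insertion
                            prev[j - 1] + (x != y)))   # substitution
        prev = curr
    return prev[-1]


def sample_hypotheses(posteriors, vocab, num_samples, rng):
    """Draw num_samples sequences, sampling each position independently.

    `posteriors` stands in for the output of one NAR forward pass:
    one weight vector over `vocab` per output position (assumed layout).
    """
    return [
        tuple(rng.choices(vocab, weights=dist, k=1)[0] for dist in posteriors)
        for _ in range(num_samples)
    ]


def mbr_select(hypotheses):
    """Return the hypothesis with the highest average utility.

    The samples serve both as candidates and as pseudo-references.
    Utility here is negative edit distance (an assumed choice).
    """
    def expected_utility(h):
        return sum(-edit_distance(h, r) for r in hypotheses) / len(hypotheses)

    return max(hypotheses, key=expected_utility)


if __name__ == "__main__":
    rng = random.Random(0)
    vocab = ["a", "b", "c"]
    # Toy posteriors for three output positions (made-up numbers).
    posteriors = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
    samples = sample_hypotheses(posteriors, vocab, num_samples=16, rng=rng)
    print(mbr_select(samples))
```

Because every sample comes from the same set of per-position distributions, drawing more samples in this sketch does not require another model call, which mirrors the abstract's point that multiple samples are obtained with a single forward computation. The pairwise utility computation grows quadratically with the number of samples.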

Natural Language Processing · Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
