Radboud | Mar 10, 2026 | arXiv:2603.09725

A Semi-spontaneous Dutch Speech Dataset for Speech Enhancement and Speech Recognition

Dimme de Groot, Yuanyuan Zhang, Jorge Martinez, Odette Scharenborg

AI Summary

The authors introduce DRES, a 1.5-hour Dutch semi-spontaneous speech dataset recorded from 80 speakers in noisy, public indoor environments using a four-channel linear microphone array. The dataset is designed as a test set for evaluating speech enhancement (SE) and automatic speech recognition (ASR) models in realistic conditions. In experiments on DRES, five of the eight off-the-shelf ASR models evaluated achieved WERs below 22%, but the five single-channel SE algorithms tested did not improve ASR performance, highlighting the need for realistic evaluation scenarios.

Key Contribution

Modern single-channel speech enhancement algorithms did not improve ASR performance on realistic noisy recordings, which challenges assumptions about their effectiveness in real-world applications.

Abstract

We present DRES: a 1.5-hour Dutch realistic elicited (semi-spontaneous) speech dataset from 80 speakers recorded in noisy, public indoor environments. DRES was designed as a test set for the evaluation of state-of-the-art (SOTA) automatic speech recognition (ASR) and speech enhancement (SE) models in a real-world scenario: a person speaking in a public indoor space with background talkers and noise. The speech was recorded with a four-channel linear microphone array. In this work we evaluate the speech quality of five well-known single-channel SE algorithms and the recognition performance of eight SOTA off-the-shelf ASR models before and after applying SE on the speech of DRES. We found that five out of the eight ASR models have WERs lower than 22% on DRES, despite the challenging conditions. In contrast to recent work, we did not find a positive effect of modern single-channel SE on ASR performance, emphasizing the importance of evaluating in realistic conditions.
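For readers unfamiliar with the metric, WER (word error rate) is conventionally the word-level edit distance between a reference transcript and the ASR output, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not the authors' evaluation pipeline: the function name and the example sentences are hypothetical, and real evaluations usually normalize casing and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example (not taken from DRES): one deletion and one substitution
# against a six-word reference.
reference = "ik wil graag een kopje koffie"
hypothesis = "ik wil een kopje thee"
wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}")
```

Under this definition, a WER below 22% means fewer than 22 word errors (substitutions, deletions, and insertions combined) per 100 reference words. Comparing the WER of the same ASR model on the raw and the enhanced audio is one way to measure whether SE helps recognition.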

Data Curation & Synthetic Data, Natural Language Processing, Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
