Independent Researcher | Apr 12, 2026 | arXiv:2604.10736

BlasBench: An Open Benchmark for Irish Speech Recognition

AI Summary

The authors introduce BlasBench, an open benchmark and evaluation harness for Irish Automatic Speech Recognition (ASR) with Irish-aware text normalisation. They benchmark 12 ASR systems across four architecture families on the Common Voice ga-IE and FLEURS ga-IE datasets. Results vary widely: every Whisper variant exceeds 100% WER, while the best open model reaches 30.65% WER on Common Voice and 39.09% on FLEURS. Models fine-tuned on Common Voice score 33-43 WER points worse on FLEURS, a generalisation gap that single-dataset evaluation does not reveal.

Key Contribution

Models fine-tuned on Common Voice score 33-43 WER points worse on FLEURS, so evaluating on Common Voice alone overstates how well they generalise.

Abstract

No open Irish-specific benchmark compares end-user ASR systems under a shared Irish-aware evaluation protocol. To fill this gap, we release BlasBench, an open evaluation harness with Irish-aware text normalisation that preserves fadas, lenition, and eclipsis. We benchmark 12 systems across four architecture families on Common Voice ga-IE and FLEURS ga-IE. All Whisper variants exceed 100% WER. The best open model (omniASR LLM 7B) achieves 30.65% WER on Common Voice and 39.09% on FLEURS. Models fine-tuned on Common Voice degrade by 33-43 WER points on FLEURS, revealing a generalisation gap that is invisible to single-dataset evaluation.
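To make the evaluation protocol concrete, the following is a minimal sketch of what Irish-aware normalisation and word error rate (WER) scoring could look like. It is not the BlasBench implementation: the function names `normalise_irish` and `wer`, and the specific rules (lowercasing, keeping hyphens and apostrophes, stripping other punctuation), are assumptions made for illustration. The point it shows is that accented vowels are kept, so `féar` (grass) and `fear` (man) remain different words, and that WER can exceed 100% when a hypothesis contains more errors than the reference has words.

```python
import re
import unicodedata

# Assumption: drop punctuation but keep word characters, whitespace,
# apostrophes (d'ól) and hyphens (t-uisce, n-éan). Hypothetical rule set.
_PUNCT = re.compile(r"[^\w\s'-]")


def normalise_irish(text: str) -> str:
    """Illustrative normaliser that preserves fadas and initial mutations."""
    # NFC composes "a" + combining acute into "á" instead of stripping it.
    text = unicodedata.normalize("NFC", text)
    text = text.replace("\u2019", "'")  # unify curly apostrophes
    text = text.lower()                 # "bhFrainc" -> "bhfrainc", letters kept
    text = _PUNCT.sub(" ", text)
    return " ".join(text.split())


def wer(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length."""
    ref = normalise_irish(reference).split()
    hyp = normalise_irish(hypothesis).split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(normalise_irish("Tá an t-uisce sa bhFrainc, a Sheáin!"))
    # A dropped fada counts as a substitution: 1 error over 3 words.
    print(wer("Tá féar ann.", "tá fear ann"))
    # Six wrong words against a two-word reference gives a WER of 3.0 (300%).
    print(wer("tá sé", "the shape of the day today"))
```

A normaliser that stripped diacritics would score the second example as a perfect match, which is why the choice of normalisation rules affects reported WER.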

Eval Frameworks & Benchmarks | Open-Source Models & Weights | Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
