This paper presents a systematic study of automatic speech recognition (ASR) performance on speech from individuals with Huntington's disease (HD), using a clinical speech corpus not previously used for end-to-end ASR training. The authors compare several ASR architectures, find that Parakeet-TDT performs best, and show that HD-specific adaptation further reduces word error rate (WER). They also introduce a biomarker-based auxiliary supervision method and show that it reshapes error behavior in severity-dependent ways rather than uniformly improving WER.
Adapting ASR models to Huntington's disease speech not only improves accuracy but also reveals how biomarker-based supervision can reshape error patterns in ways that reflect disease severity.
Automatic speech recognition (ASR) for pathological speech remains underexplored, especially for Huntington's disease (HD), where irregular timing, unstable phonation, and articulatory distortion challenge current models. We present a systematic HD-ASR study using a high-fidelity clinical speech corpus not previously used for end-to-end ASR training. We compare multiple ASR families under a unified evaluation, analyzing WER as well as substitution, deletion, and insertion patterns. HD speech induces architecture-specific error regimes, with Parakeet-TDT outperforming encoder-decoder and CTC baselines. HD-specific adaptation reduces WER from 6.99% to 4.95%. We also propose a method for biomarker-based auxiliary supervision and analyze how it reshapes error behavior in severity-dependent ways rather than uniformly improving WER. We open-source all code and models.
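For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how WER and its substitution, deletion, and insertion components are commonly computed from a word-level edit-distance alignment. It is not the authors' evaluation code: the names `ErrorCounts`, `count_errors`, and `relative_reduction` are hypothetical, text normalization is reduced to whitespace splitting, and when several alignments have the same total cost the split between error types depends on the tie-breaking order used here.

```python
from typing import NamedTuple


class ErrorCounts(NamedTuple):
    """Word-level error counts for one reference/hypothesis pair (illustrative type)."""
    substitutions: int
    deletions: int
    insertions: int
    ref_words: int

    @property
    def wer(self) -> float:
        # WER = (S + D + I) / N, where N is the number of reference words.
        total = self.substitutions + self.deletions + self.insertions
        return total / self.ref_words


def count_errors(reference: str, hypothesis: str) -> ErrorCounts:
    """Align two transcripts with Levenshtein distance and count S, D, I."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    rows, cols = len(ref) + 1, len(hyp) + 1
    # table[i][j] = (total_edits, S, D, I) for ref[:i] versus hyp[:j]
    table = [[(0, 0, 0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)  # every hypothesis word inserted

    for i in range(1, rows):
        for j in range(1, cols):
            if ref[i - 1] == hyp[j - 1]:
                table[i][j] = table[i - 1][j - 1]  # match, no cost
                continue
            t, s, d, n = table[i - 1][j - 1]
            substitution = (t + 1, s + 1, d, n)
            t, s, d, n = table[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = table[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            # Ties on total cost are broken in the order listed (assumption).
            table[i][j] = min(substitution, deletion, insertion, key=lambda c: c[0])

    _, s, d, n = table[-1][-1]
    return ErrorCounts(s, d, n, len(ref))


def relative_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Fraction of the baseline WER removed by adaptation."""
    return (baseline_wer - adapted_wer) / baseline_wer


if __name__ == "__main__":
    counts = count_errors("she walked home slowly", "she walk home")
    print(counts, counts.wer)
    # WER figures reported in the abstract: 6.99% before, 4.95% after adaptation.
    print(relative_reduction(6.99, 4.95))
```

Applied to the figures reported in the abstract, `relative_reduction(6.99, 4.95)` corresponds to a relative WER reduction of roughly 29%. Tracking the three counts separately, rather than only their sum, is what makes it possible to describe how error behavior shifts even when overall WER changes little.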