This paper investigates whether Speech Articulatory Coding (SPARC) features can predict surface electromyography (sEMG) envelopes across aloud, mimed, and subvocal speech, using elastic-net multivariate temporal response function (mTRF) models. SPARC features outperform phoneme one-hot representations in predicting sEMG activity in all speech modes and on nearly all electrodes. Variance partitioning shows a substantial unique contribution from SPARC, supporting it as a robust and interpretable intermediate target for silent-speech modeling.
SPARC features predict sEMG envelopes more accurately than phoneme one-hot representations and give anatomically interpretable models for sEMG-based silent-speech work.
We test whether Speech Articulatory Coding (SPARC) features can linearly predict surface electromyography (sEMG) envelopes across aloud, mimed, and subvocal speech in twenty-four subjects. Using elastic-net multivariate temporal response function (mTRF) models with sentence-level cross-validation, we find that SPARC yields higher prediction accuracy than phoneme one-hot representations on nearly all electrodes and in all speech modes. Aloud and mimed speech perform comparably, and subvocal speech remains above chance, indicating detectable articulatory activity. Variance partitioning shows a substantial unique contribution from SPARC and a minimal unique contribution from phoneme features. mTRF weight patterns reveal anatomically interpretable relationships between electrode sites and articulatory movements, and these relationships remain consistent across modes. This study focuses on representation/encoding analysis (not end-to-end decoding) and supports SPARC as a robust and interpretable intermediate target for sEMG-based silent-speech modeling.
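The encoding and variance-partitioning steps described above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the authors' pipeline. The function names (`lag_matrix`, `elastic_net_fit`, `heldout_r2`, `unique_contributions`), the lag set, the regularization values, and the use of held-out R² as the accuracy measure are all assumptions made for the example. The elastic-net fit uses a simple proximal-gradient solver in place of whatever solver the study used, and a single contiguous train/test split stands in for sentence-level cross-validation.

```python
import numpy as np

def lag_matrix(x, lags):
    """Stack time-shifted copies of x (time x features), one block per lag in samples."""
    n, f = x.shape
    out = np.zeros((n, f * len(lags)))
    for i, lag in enumerate(lags):
        shifted = np.zeros_like(x)
        if lag > 0:
            shifted[lag:] = x[:-lag]      # feature leads the response by `lag` samples
        elif lag < 0:
            shifted[:lag] = x[-lag:]
        else:
            shifted[:] = x
        out[:, i * f:(i + 1) * f] = shifted
    return out

def elastic_net_fit(X, y, alpha=0.01, l1_ratio=0.5, n_iter=500):
    """Proximal-gradient elastic net (hypothetical stand-in solver, no intercept)."""
    n = X.shape[0]
    l1, l2 = alpha * l1_ratio, alpha * (1.0 - l1_ratio)
    # Step size from the Lipschitz constant of the smooth (squared-error + L2) part.
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / n + l2)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n + l2 * w
        z = w - step * grad
        w = np.sign(z) * np.maximum(np.abs(z) - step * l1, 0.0)  # soft threshold (L1)
    return w

def heldout_r2(X_tr, y_tr, X_te, y_te, **kw):
    """Fit on the training rows, return R^2 on the held-out rows."""
    w = elastic_net_fit(X_tr, y_tr, **kw)
    resid = y_te - X_te @ w
    return 1.0 - np.sum(resid ** 2) / np.sum((y_te - y_te.mean()) ** 2)

def unique_contributions(A, B, y, split, **kw):
    """Variance partitioning: unique part of A is R^2(A+B) minus R^2(B), and vice versa."""
    tr, te = slice(0, split), slice(split, None)
    AB = np.hstack([A, B])
    r2_ab = heldout_r2(AB[tr], y[tr], AB[te], y[te], **kw)
    r2_a = heldout_r2(A[tr], y[tr], A[te], y[te], **kw)
    r2_b = heldout_r2(B[tr], y[tr], B[te], y[te], **kw)
    return {"joint": r2_ab, "unique_A": r2_ab - r2_b, "unique_B": r2_ab - r2_a}

# Synthetic demo: the "envelope" y is driven only by the continuous feature set.
rng = np.random.default_rng(0)
n = 2000
artic = rng.standard_normal((n, 3))          # stand-in for continuous articulatory features
phon = np.eye(4)[rng.integers(0, 4, n)]      # stand-in for phoneme one-hot features
lags = [0, 1, 2]                             # assumed lag window, in samples
A, B = lag_matrix(artic, lags), lag_matrix(phon, lags)
y = A @ np.linspace(-1.0, 1.0, A.shape[1]) + 0.5 * rng.standard_normal(n)

parts = unique_contributions(A, B, y, split=1500)
print(parts)
```

Because the synthetic response is built only from the continuous features, the unique share for `A` should be large and the unique share for `B` close to zero. This mirrors the qualitative pattern the abstract reports for SPARC versus phoneme features, but it carries no evidential weight about the real data.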