The paper introduces iMiGUE-Speech, a spontaneous affective speech dataset that extends iMiGUE with speech transcripts, interviewer and interviewee speaker roles, and word-level forced alignments. It addresses a limitation of existing emotional speech datasets by capturing affect that arises naturally from real match outcomes rather than acted or laboratory-elicited emotions. The authors establish baselines for speech emotion recognition and transcript-based sentiment analysis using pretrained models, demonstrating the dataset's utility for affective analysis.
Finally, a speech emotion dataset that captures *real* spontaneous affect, not just acted emotions, opening the door to more realistic models.
This work presents iMiGUE-Speech, an extension of the iMiGUE dataset that provides a spontaneous affective corpus for studying emotional and affective states. The new release focuses on speech and enriches the original dataset with additional metadata, including speech transcripts, speaker-role separation between interviewer and interviewee, and word-level forced alignments. Unlike existing emotional speech datasets that rely on acted or laboratory-elicited emotions, iMiGUE-Speech captures spontaneous affect arising naturally from real match outcomes. To demonstrate the utility of the dataset and establish initial benchmarks, we introduce two evaluation tasks for comparative assessment: speech emotion recognition and transcript-based sentiment analysis. These tasks leverage state-of-the-art pre-trained representations to assess the dataset's ability to capture spontaneous affective states from both acoustic and linguistic modalities. iMiGUE-Speech can also be synchronously paired with micro-gesture annotations from the original iMiGUE dataset, forming a uniquely multimodal resource for studying speech-gesture affective dynamics. The extended dataset is available at https://github.com/CV-AC/imigue-speech.
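As a rough illustration of how word-level alignments, speaker roles, and time-stamped gesture annotations might be combined, the sketch below finds the interviewee words spoken while a gesture is in progress. The record layout, field names, role strings, and gesture label are all assumptions made for this example; the dataset's actual file format is not described here and may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AlignedWord:
    """One forced-aligned word (hypothetical layout, times in seconds)."""
    text: str
    start: float
    end: float
    role: str  # assumed values: "interviewer" or "interviewee"


@dataclass(frozen=True)
class GestureSpan:
    """One time-stamped gesture annotation (hypothetical layout)."""
    label: str
    start: float
    end: float


def words_during(gesture: GestureSpan,
                 words: List[AlignedWord],
                 role: str = "interviewee") -> List[str]:
    """Return the words by `role` whose time span overlaps the gesture."""
    return [
        w.text
        for w in words
        # Two intervals overlap when each one starts before the other ends.
        if w.role == role and w.start < gesture.end and w.end > gesture.start
    ]


# Toy data, invented for illustration only.
words = [
    AlignedWord("how", 0.0, 0.2, "interviewer"),
    AlignedWord("tough", 0.2, 0.6, "interviewer"),
    AlignedWord("very", 1.0, 1.3, "interviewee"),
    AlignedWord("tough", 1.3, 1.8, "interviewee"),
    AlignedWord("match", 1.8, 2.2, "interviewee"),
]
gesture = GestureSpan("hypothetical_gesture", 1.2, 1.9)

overlap = words_during(gesture, words)
print(overlap)
```

Filtering by role before testing for overlap keeps interviewer speech out of the result, which matters if the affect of interest is the interviewee's.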