This paper describes the TCG CREST system developed for Track 1 (speaker diarization) of the DISPLACE-M challenge, which targets noisy, real-world medical conversations. The authors compare a SpeechBrain-based modular pipeline with a hybrid end-to-end neural diarization system (Diarizen) built on WavLM, and explore clustering techniques including AHC and several spectral clustering variants. The Diarizen system outperformed the SpeechBrain baseline by roughly 39% relative DER; the best submitted system, Diarizen with AHC, achieved a 9.21% DER on the evaluation set and ranked sixth of 11 teams after Phase I.
A WavLM-based Diarizen system cuts diarization error rate by roughly 39% relative to a SpeechBrain pipeline on noisy rural-healthcare conversations.
This report presents the TCG CREST system description for Track 1 (Speaker Diarization) of the DISPLACE-M challenge, which focuses on naturalistic medical conversations in noisy rural-healthcare scenarios. Our study evaluates the impact of various voice activity detection (VAD) methods and advanced clustering algorithms on overall speaker diarization (SD) performance. We compare and analyze two SD frameworks: a modular pipeline using SpeechBrain with ECAPA-TDNN embeddings, and a state-of-the-art (SOTA) hybrid end-to-end neural diarization system, Diarizen, built on top of a pre-trained WavLM. With these frameworks, we explore diverse clustering techniques, including agglomerative hierarchical clustering (AHC) and multiple novel variants of spectral clustering, such as SC-adapt, SC-PNA, and SC-MK. Experimental results demonstrate that the Diarizen system provides an approximately 39% relative improvement in diarization error rate (DER) over the SpeechBrain baseline in the post-evaluation analysis of Phase I. Our best-performing submitted system, which uses the Diarizen baseline with AHC and median filtering with a larger context window of 29, achieved a DER of 10.37% on the development set and 9.21% on the evaluation set. Our team ranked sixth out of the 11 participating teams after the Phase I evaluation.
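To illustrate the median-filtering step mentioned above, here is a minimal sketch in plain Python. It assumes the filter is applied to a binary, frame-level activity sequence for one speaker and that the window of 29 is measured in frames; the abstract does not state either detail. The function name `median_smooth` and the toy input are hypothetical and are not taken from the Diarizen code base.

```python
from statistics import median


def median_smooth(activity, window=29):
    """Smooth a binary frame-level activity sequence with a sliding median.

    Hypothetical helper for illustration only: `activity` is a list of 0/1
    frame decisions for a single speaker, `window` is an odd number of frames.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    if not activity:
        return []
    half = window // 2
    # Repeat the edge values so the output has the same length as the input.
    padded = [activity[0]] * half + list(activity) + [activity[-1]] * half
    # With an odd window and 0/1 inputs, the median is a majority vote.
    return [int(median(padded[i:i + window])) for i in range(len(activity))]


# Toy example (assumed data): a 5-frame spurious burst followed by a
# 40-frame speech segment.
noisy = [0] * 20 + [1] * 5 + [0] * 20 + [1] * 40 + [0] * 15
smoothed = median_smooth(noisy, window=29)
```

In this toy case the 5-frame burst is removed while the 40-frame segment keeps its boundaries. A larger window suppresses more short spurious segments and fills more short gaps, at the cost of erasing genuine speaker turns shorter than about half the window.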