This paper introduces Augmented Reverberant-Target Training (ARTT) for unsupervised monaural speech dereverberation, a challenging task due to the lack of clean reference signals. ARTT first uses reverberant-target training (RTT) to train a DNN to recover the observed reverberant mixture from a further reverberated version, surprisingly leading to reverberation reduction. Then, it employs an online self-distillation mechanism based on the mean-teacher algorithm to refine the dereverberation performance.
Training a DNN to recover a reverberant signal from a *more* reverberant version surprisingly reduces reverberation in the original signal.
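The pairing described above can be sketched as follows. This is only an illustration of how an RTT training pair might be built, not the authors' implementation: the function names `synthetic_rir` and `make_rtt_pair`, the exponentially decaying noise model for the room impulse response, and the values `rt60_s=0.4` and `sample_rate=16000` are all assumptions introduced here.

```python
import numpy as np

def synthetic_rir(rt60_s, sample_rate, rng):
    """Hypothetical stand-in RIR: white noise whose amplitude decays by 60 dB at rt60_s."""
    n = int(rt60_s * sample_rate)
    t = np.arange(n) / sample_rate
    decay = 10.0 ** (-3.0 * t / rt60_s)  # amplitude envelope, 1e-3 at t = rt60_s
    rir = rng.standard_normal(n) * decay
    rir[0] = 1.0                         # keep a direct-path tap
    return rir / np.linalg.norm(rir)

def make_rtt_pair(observed, rt60_s=0.4, sample_rate=16000, seed=0):
    """Return (network input, training target) for reverberant-target training.

    The input is the observed signal reverberated once more; the target is
    the observed reverberant signal itself, so no clean reference is needed.
    """
    rng = np.random.default_rng(seed)
    rir = synthetic_rir(rt60_s, sample_rate, rng)
    further = np.convolve(observed, rir)[: len(observed)]  # truncate to input length
    return further, observed

# Placeholder for one second of observed reverberant speech (assumed data).
observed = np.random.default_rng(1).standard_normal(16000)
net_input, target = make_rtt_pair(observed)
```

A DNN trained on such pairs learns to undo the added reverberation; the finding reported here is that the same network also reduces reverberation when applied directly to the observed signal.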
Monaural unsupervised speech dereverberation is a challenging ill-posed inverse problem because neither clean reference signals nor spatial cues are available. To address it, we propose augmented reverberant-target training (ARTT), which consists of two stages. In the first stage, reverberant-target training (RTT) further reverberates the observed reverberant mixture signal and then trains a deep neural network (DNN), via discriminative training, to recover the observed mixture from its further-reverberated version. Although the target signal is itself reverberant, we find that the resulting DNN effectively reduces reverberation. In the second stage, an online self-distillation mechanism based on the mean-teacher algorithm further improves dereverberation. Evaluation results show that ARTT achieves strong unsupervised dereverberation performance, significantly outperforming previous baselines.
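For readers unfamiliar with the mean-teacher algorithm mentioned in the second stage, its core is a teacher model whose weights are an exponential moving average (EMA) of the student's weights. The sketch below shows only that generic update; the helper name `ema_update`, the dictionary-of-arrays parameter layout, and the decay value are assumptions, and the abstract does not specify how ARTT uses the teacher's outputs during self-distillation.

```python
import numpy as np

def ema_update(teacher, student, decay=0.999):
    """Generic mean-teacher step: teacher <- decay * teacher + (1 - decay) * student."""
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }

# Toy parameters (hypothetical): the teacher drifts slowly toward the student.
student = {"w": np.ones(3), "b": np.zeros(1)}
teacher = {"w": np.zeros(3), "b": np.zeros(1)}
teacher = ema_update(teacher, student, decay=0.9)
```

Because the teacher averages over many student updates, its outputs tend to be more stable than the student's, which is what makes them usable as targets in online self-distillation.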