The paper introduces Relativistic Adversarial Feedback (RAF), a novel training objective for GAN vocoders designed to improve both in-domain fidelity and out-of-domain generalization. RAF incorporates speech self-supervised learning models to enhance the discriminator's ability to evaluate sample quality, thereby encouraging the generator to learn more robust representations. Experimental results across multiple datasets demonstrate that RAF consistently improves objective and subjective metrics, with a RAF-trained BigVGAN-base outperforming an LSGAN-trained BigVGAN in perceptual quality while using significantly fewer parameters.
Achieve state-of-the-art speech synthesis with a GAN vocoder 88% smaller than BigVGAN, thanks to a new training objective that leverages self-supervised learning for better generalization.
We propose Relativistic Adversarial Feedback (RAF), a novel training objective for GAN vocoders that improves in-domain fidelity and generalization to unseen scenarios. Although modern GAN vocoders employ advanced architectures, their training objectives often fail to promote generalizable representations. RAF addresses this problem by leveraging speech self-supervised learning models to assist discriminators in evaluating sample quality, encouraging the generator to learn richer representations. Furthermore, we utilize relativistic pairing for real and fake waveforms to improve the modeling of the training data distribution. Experiments across multiple datasets show consistent gains in both objective and subjective metrics on GAN-based vocoders. Importantly, the RAF-trained BigVGAN-base outperforms the LSGAN-trained BigVGAN in perceptual quality using only 12% of the parameters. Comparative studies further confirm the effectiveness of RAF as a training framework for GAN vocoders.
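To make "relativistic pairing" concrete, the sketch below shows one plausible form of a relativistic adversarial loss, in the spirit of the standard relativistic GAN formulation: the discriminator is trained to score each real waveform higher than its paired generated waveform, and the generator is trained to reverse that ordering. This is an illustrative assumption, not the paper's released RAF implementation; the function names and the toy logits are hypothetical, and the SSL-assisted discriminator is only referenced in comments.

```python
# Minimal sketch (assumption): a relativistic pairing loss of the kind used in
# relativistic GANs, applied to paired real/fake discriminator logits. The exact
# RAF objective and its SSL-feature-assisted discriminator are not shown here.
import torch
import torch.nn.functional as F


def relativistic_d_loss(d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
    """Discriminator loss: each real waveform should score higher than its paired fake."""
    return F.binary_cross_entropy_with_logits(
        d_real - d_fake, torch.ones_like(d_real)
    )


def relativistic_g_loss(d_real: torch.Tensor, d_fake: torch.Tensor) -> torch.Tensor:
    """Generator loss: each generated waveform should score higher than its paired real."""
    return F.binary_cross_entropy_with_logits(
        d_fake - d_real, torch.ones_like(d_fake)
    )


if __name__ == "__main__":
    # Toy usage: random logits stand in for the outputs of a waveform
    # discriminator, which in RAF may also consume speech SSL features.
    logits_real = torch.randn(8, 1)  # D(x_real) for a batch of real waveforms
    logits_fake = torch.randn(8, 1)  # D(x_fake) for the paired generated waveforms
    print("D loss:", relativistic_d_loss(logits_real, logits_fake).item())
    print("G loss:", relativistic_g_loss(logits_real, logits_fake).item())
```

Because the loss depends only on the difference between paired real and fake scores, the discriminator's feedback is relative rather than absolute, which is the property the abstract credits with better modeling of the training data distribution.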