Tsinghua AI · Mar 17, 2026 · arXiv:2603.16805

Making Separation-First Multi-Stream Audio Watermarking Feasible via Joint Training

Houmin Sun, Zi Hu, Zipei Hu, Linxi Li, Yechen Wang, Liwei Jin, Ming Li

AI Summary

This paper investigates multi-stream audio watermarking in which individual audio stems are watermarked before mixing, the mixture is then separated, and a watermark is decoded from each separated output. The authors find that naively combining existing watermarking and separation techniques performs poorly because of separation artifacts. To address this, they propose a joint training framework for the watermarking system and the separator, which substantially improves watermark recovery after separation.

Key Contribution

Jointly training audio watermarking and source separation unlocks robust multi-stream watermarking, enabling independent tracking of individual audio components within a mix.

Abstract

Modern audio is created by mixing stems from different sources, raising the question: can we independently watermark each stem and recover all watermarks after separation? We study a separation-first, multi-stream watermarking framework: embedding distinct information into stems using unique keys but a shared structure, mixing, separating, and decoding from each output. A naive pipeline (robust watermarking + off-the-shelf separation) yields poor bit recovery, showing that robustness to generic distortions does not ensure robustness to separation artifacts. To make this feasible, we jointly train the watermark system and the separator in an end-to-end manner, encouraging the separator to preserve watermark cues while adapting embedding to separation-specific distortions. Experiments on speech+music and vocal+accompaniment mixtures show substantial gains in post-separation recovery while maintaining perceptual quality.
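The embed, mix, separate, decode flow described in the abstract can be illustrated with a minimal toy sketch. This is not the paper's neural system: it assumes a classical key-seeded spread-spectrum watermark and an oracle stand-in for the separator that simply leaks a fraction of the other stem into each output. All names here (`carriers`, `embed`, `decode`, `toy_separator`, `run_pipeline`), the sample rate, the embedding strength, and the leak factor are hypothetical choices made for illustration only.

```python
import numpy as np

SAMPLE_RATE = 16000  # assumed length of one second of audio for this toy example


def carriers(key, n_bits, n_samples):
    """Key-seeded pseudo-random +/-1 carriers, one row per payload bit."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=(n_bits, n_samples))


def embed(stem, bits, key, strength=0.02):
    """Add a spread-spectrum watermark: each bit sets the sign of its carrier."""
    signs = 2.0 * np.asarray(bits, dtype=float) - 1.0
    return stem + strength * (signs @ carriers(key, len(bits), len(stem)))


def decode(audio, key, n_bits):
    """Recover bits from the sign of the correlation with each carrier."""
    scores = carriers(key, n_bits, len(audio)) @ audio
    return [int(s > 0) for s in scores]


def toy_separator(watermarked_stems, leak=0.2):
    """Oracle stand-in for a real separator (assumption): each output is the
    true watermarked stem plus a fraction `leak` of everything else in the mix."""
    mixture = np.sum(watermarked_stems, axis=0)
    return [s + leak * (mixture - s) for s in watermarked_stems]


def run_pipeline(bits_per_stem, keys, leak=0.2):
    """Watermark two synthetic stems with distinct keys, mix, separate, decode."""
    t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
    noise_stem = np.random.default_rng(0).normal(0.0, 0.1, SAMPLE_RATE)
    tone_stem = 0.1 * np.sin(2 * np.pi * 440.0 * t)
    stems = [noise_stem, tone_stem]
    marked = [embed(s, b, k) for s, b, k in zip(stems, bits_per_stem, keys)]
    separated = toy_separator(marked, leak)
    return [decode(out, k, len(b))
            for out, b, k in zip(separated, bits_per_stem, keys)]
```

Calling `run_pipeline` with two 16-bit payloads and two distinct keys returns one decoded bit list per separated output. The oracle separator here is far gentler than a learned one; the abstract's point is that real separation artifacts can destroy watermark cues that survive generic distortions, which is what motivates training the watermark system and separator jointly.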

Architecture Design (Transformers, SSMs, MoE) · Speech & Audio · Training Efficiency & Optimization

Citation Metrics

Citations: 0

Influential citations: 0

References: 28

Year: 2026

Venue: N/A
