Tsinghua AI | Jun 10, 2026 | arXiv:2606.11828

Feature-Aligned Speech Watermarking for Robustness to Reconstruction Distortions

Haiyun Li, Shuhai Peng, S. Peng, Zhisheng Zhang, Jingran Xie, Xiaofeng Xie, Hanyang Peng

AI Summary

This paper introduces a feature-aligned watermarking method that embeds identifiable information into audio while maintaining imperceptibility and enhancing robustness against speech reconstruction distortions. By aligning the watermark with the original speech feature distribution, the approach allows for higher watermark energy without compromising perceptual quality. Experimental results demonstrate that this method significantly outperforms existing techniques in robustness against both seen and unseen reconstruction models while preserving audio fidelity.

Key Contribution

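Aligning the watermark with the original speech feature distribution allows higher watermark energy, and therefore greater robustness to speech reconstruction, while keeping imperceptibility comparable to existing methods. This eases the robustness-fidelity trade-off that limits low-energy watermark designs.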

Abstract

Audio watermarking aims to embed identifiable information into audio while remaining imperceptible. Existing methods adopt high-fidelity, low-energy designs to preserve perceptual quality, but the resulting watermarks lack robustness under suppression by speech reconstruction models. Improving robustness is challenging due to the inherent robustness-fidelity trade-off in existing designs, where increasing watermark energy improves robustness but reduces fidelity. To address this problem, we propose a feature-aligned watermarking method that aligns the watermark with the original speech feature distribution, allowing higher watermark energy to improve robustness while preserving imperceptibility. We use a pretrained speech codec to generate a pseudo-speech watermark and fuse it into the spectrogram of the input audio, with VAD loss and perceptual losses guiding embedding within voiced regions. Experiments show that our method maintains imperceptibility comparable to existing approaches while substantially improving robustness under both seen and unseen speech reconstruction models.
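The trade-off and the voiced-region constraint described in the abstract can be illustrated with a toy example. The sketch below is not the paper's method: it works in the time domain rather than on the spectrogram, uses random noise in place of a codec-generated pseudo-speech watermark, and replaces the learned VAD loss with a simple energy threshold. All function names, the frame length, and the threshold are assumptions chosen for illustration.

```python
import numpy as np

def frame_energy_vad(signal, frame_len=400, threshold_db=-35.0):
    """Toy energy-based VAD (an assumption, not the paper's VAD loss).

    Returns one boolean per frame: True where the frame energy is within
    `threshold_db` of the loudest frame.
    """
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.mean(frames ** 2, axis=1) + 1e-12  # avoid log(0)
    energy_db = 10.0 * np.log10(energy / energy.max())
    return energy_db > threshold_db

def embed_in_voiced(signal, watermark, strength=0.1, frame_len=400):
    """Add a scaled watermark only inside frames the toy VAD marks as voiced."""
    voiced = frame_energy_vad(signal, frame_len)
    mask = np.repeat(voiced, frame_len).astype(signal.dtype)
    n = mask.size
    marked = signal.copy()
    marked[:n] += strength * mask * watermark[:n]
    return marked, mask

def snr_db(clean, marked):
    """Signal-to-watermark ratio in dB, used here as a crude fidelity proxy."""
    noise = marked - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / (np.sum(noise ** 2) + 1e-12))

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr // 2) / sr
    # Half a second of silence followed by half a second of a 220 Hz tone.
    speech_like = np.concatenate([np.zeros(sr // 2), 0.5 * np.sin(2 * np.pi * 220 * t)])
    wm = np.random.default_rng(0).standard_normal(speech_like.size)

    for strength in (0.05, 0.2):
        marked, _ = embed_in_voiced(speech_like, wm, strength=strength)
        print(f"strength={strength}: SNR = {snr_db(speech_like, marked):.1f} dB")
```

With an unshaped noise watermark like this one, raising `strength` lowers the SNR directly, which is the trade-off the abstract describes. The paper's claim is that shaping the watermark to resemble speech features lets the energy rise without a comparable perceptual cost, something this toy noise signal does not capture.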

Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
