Tsinghua AI | iFlytek | USTC | Jun 4, 2026 | arXiv:2606.05876

An Ultra-Low-Bitrate Neural Speech Codec with Plain-to-Pseudo Synergistic Vector Quantization

Xiao-Hang Jiang, Yang Ai, Fei Liu, Rui-Chen Zheng, Jian-Qing Gao, Zhen-Hua Ling, Ji Wu

AI Summary

This paper introduces P2PSynCodec, an ultra-low-bitrate neural speech codec built on a plain-to-pseudo synergistic vector quantizer (P2PSVQ). P2PSVQ combines one plain vector quantizer, which produces the basic tokens by quantization, with multiple pseudo vector quantizers, whose auxiliary tokens are generated by neural prediction and add nothing to the transmitted bitrate. In the reported experiments, P2PSynCodec operating at 0.5 kbps reaches speech reconstruction quality comparable to that of competing codecs operating at 2.0 kbps.

Key Contribution

P2PSVQ pairs one transmitted plain VQ with multiple predicted pseudo VQs that cost no bitrate, so the codec operates at 0.5 kbps while matching the reconstruction quality of competing codecs at 2.0 kbps.

Abstract

Most neural speech codecs use residual vector quantization (RVQ), in which later VQs contribute less but consume the same bitrate, leading to inefficiency. We propose P2PSynCodec, an ultra-low-bitrate neural speech codec with a plain-to-pseudo synergistic vector quantizer (P2PSVQ). P2PSVQ consists of one plain VQ and multiple pseudo VQs. The plain VQ produces basic tokens by quantization, while the pseudo VQs generate auxiliary tokens by neural prediction and incur zero transmitted bitrate. Thus, speech is decoded from the plain-VQ tokens together with predicted pseudo-VQ tokens, greatly reducing bitrate. Experiments show that P2PSynCodec achieves speech reconstruction quality comparable to competing codecs at 2.0 kbps while operating at only 0.5 kbps, demonstrating high efficiency for ultra-low-bitrate speech coding.
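The bitrate argument in the abstract can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 50 Hz frame rate, the 1024-entry codebook, the count of three pseudo VQs, and the helper names `bitrate_kbps`, `plain_vq`, and `predict_pseudo_tokens`. In particular, the pseudo-token predictor below is a deterministic placeholder standing in for a trained neural network. The point it shows is only that pseudo tokens are computed from the plain tokens at the decoder, so they are never transmitted.

```python
import math


def bitrate_kbps(frame_rate_hz, codebook_size, transmitted_vqs):
    """Bitrate in kbps of the token streams that are actually transmitted."""
    bits_per_token = math.log2(codebook_size)
    return frame_rate_hz * bits_per_token * transmitted_vqs / 1000.0


def plain_vq(frame, codebook):
    """Nearest-neighbour quantization: index of the closest codeword."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(frame, codebook[i]))


def predict_pseudo_tokens(plain_tokens, num_pseudo_vqs, codebook_size):
    """Placeholder for the neural predictor (hypothetical rule, not the paper's).

    It uses only the plain tokens as input, which is why the resulting
    auxiliary tokens add nothing to the transmitted bitrate.
    """
    return [[(t * (k + 2) + 1) % codebook_size for t in plain_tokens]
            for k in range(num_pseudo_vqs)]


# Hypothetical settings, chosen only to make the arithmetic concrete.
FRAME_RATE_HZ = 50
CODEBOOK_SIZE = 1024  # 10 bits per token

# RVQ with four quantizers transmits four token streams.
rvq_kbps = bitrate_kbps(FRAME_RATE_HZ, CODEBOOK_SIZE, transmitted_vqs=4)
# One plain VQ plus three pseudo VQs transmits a single token stream.
p2p_kbps = bitrate_kbps(FRAME_RATE_HZ, CODEBOOK_SIZE, transmitted_vqs=1)

# Toy 2-D example: quantize three frames against a four-entry codebook.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
frames = [(0.1, 0.2), (0.9, 0.1), (0.8, 0.95)]
plain_tokens = [plain_vq(f, codebook) for f in frames]  # transmitted
pseudo_tokens = predict_pseudo_tokens(plain_tokens, 3, len(codebook))  # derived at the decoder

print(f"RVQ (4 transmitted VQs): {rvq_kbps} kbps")
print(f"1 plain + 3 pseudo VQs:  {p2p_kbps} kbps")
print("plain tokens:", plain_tokens)
print("pseudo tokens:", pseudo_tokens)
```

Under these assumed settings, four transmitted streams cost 2.0 kbps and a single transmitted stream costs 0.5 kbps, which matches the ratio quoted in the abstract. The codec's actual frame rate and codebook size may differ.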

Inference & Quantization | Speech & Audio

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
