The paper introduces TQCodec, a neural audio codec operating at 44.1 kHz with bitrates from 32 kbps to 128 kbps, designed for high-fidelity music streaming. TQCodec employs a SEANet-based encoder-decoder architecture with enhancements including an imbalanced network design, SimVQ for mid-frequency detail preservation, and a phase-aware waveform loss. Experimental results on music datasets demonstrate that TQCodec achieves superior audio quality at target bitrates compared to existing codecs.
Finally, a neural audio codec that can stream high-fidelity music at standard bitrates (32-128 kbps) with superior quality.
We propose TQCodec, a neural audio codec designed for high-bitrate, high-fidelity music streaming. Unlike existing neural codecs that primarily target ultra-low bitrates (≤ 16 kbps), TQCodec operates at 44.1 kHz and supports bitrates from 32 kbps to 128 kbps, aligning with the standard quality of modern music streaming platforms. The model adopts an encoder-decoder architecture based on SEANet for efficient on-device computation and introduces several enhancements: an imbalanced network design for improved quality with low overhead, SimVQ for mid-frequency detail preservation, and a phase-aware waveform loss. Additionally, we introduce a perception-driven band-wise bit allocation strategy to prioritize perceptually critical lower frequencies. Evaluations on diverse music datasets demonstrate that TQCodec achieves superior audio quality at target bitrates, making it well-suited for high-quality audio applications.
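The abstract does not specify how the band-wise bit allocation is computed, so the following is only a minimal sketch of the general idea: a per-frame bit budget is split across frequency bands in proportion to perceptual weights that favor lower frequencies. The 50 Hz frame rate, the four-band split, the weight values, and both function names are hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def bits_per_frame(bitrate_kbps: float, frame_rate_hz: float) -> int:
    """Bit budget available for one latent frame at the given bitrate."""
    return int(bitrate_kbps * 1000 // frame_rate_hz)


def allocate_bits(total_bits: int, weights: list[float]) -> list[int]:
    """Split total_bits across bands in proportion to weights.

    Uses the largest-remainder method so the allocation sums exactly
    to total_bits. weights[0] is the lowest-frequency band.
    """
    if total_bits < 0:
        raise ValueError("total_bits must be non-negative")
    if not weights or any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative with a positive sum")

    total_weight = sum(weights)
    shares = [total_bits * w / total_weight for w in weights]
    alloc = [math.floor(s) for s in shares]

    # Hand out the bits lost to rounding, largest fractional part first.
    leftover = total_bits - sum(alloc)
    by_remainder = sorted(
        range(len(weights)), key=lambda i: shares[i] - alloc[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc


# Hypothetical setup: 64 kbps, 50 latent frames per second, four bands
# with weights that decrease toward higher frequencies.
budget = bits_per_frame(64, 50)
band_bits = allocate_bits(budget, [4.0, 3.0, 2.0, 1.0])
print(budget, band_bits)
```

With these assumed weights the lowest band receives four times the bits of the highest band. In a real codec the weights would more plausibly come from a psychoacoustic model or be tuned against listening tests, and the budget would be realized as codebook counts or sizes per band rather than raw bits.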