Tencent AI Lab, Shenzhen. Andong Li, Xiaodong Li, and Chengshi Zheng are with the Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China, and also with the University of Chinese Academy of Sciences, Beijing 100049, China (email: liandong@mail.ioa.ac.cn, lxd@mail.ioa.ac.cn, cszheng@mail.ioa.ac.cn). Zhihang Sun is with the School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China. Tong Lei, Rilin Chen, and Dong Yu are with Tencent AI Lab. Corresponding author: Chengshi Zheng.
Ditching VAE acoustic latents for semantic latents unlocks more semantically meaningful audio generation, outperforming traditional methods on AudioCaps.