MeanVC 2 halves voice conversion latency and improves robustness to low-quality audio references, a gain aimed at real-time voice applications.
FlashTTS cuts first-packet latency to 325 ms for real-time speech dialogue systems without sacrificing voice quality.
G-MaP-SE achieves superior speech enhancement by leveraging a GMM-based clean-speech prior, outperforming noisy conditioning methods and closing in on oracle performance.
InfoMerge reduces visual tokens by 85% while retaining nearly all of the original performance, improving the efficiency of Video-LLMs.
SoulX-Transcriber achieves high accuracy in multi-speaker transcription, outperforming existing models by handling speaker overlap and rapid turn-taking.