SwanSphere achieves real-time, high-fidelity spatial audio generation from panoramic video and text, overcoming the latency and spatial accuracy limitations of existing methods.
VoxMind raises task completion rates in spoken dialogue agents from 34.88% to 74.57%, surpassing even Gemini-2.5-Pro, by integrating "Think-before-Speak" reasoning and asynchronous tool management.
Reinforcement learning can now be practically applied to spoken dialogue models thanks to a new post-training recipe that disentangles semantic and acoustic improvements.