Search papers, labs, and topics across Lattice.
4
1
4
1
VoxMind drastically improves task completion rates in spoken dialogue agents, jumping from 34.88% to 74.57%, even surpassing Gemini-2.5-Pro, by integrating "Think-before-Speak" reasoning and asynchronous tool management.
Fine-grained reward signals for semantic quality and interaction timing unlock more human-like spoken dialogue models.
Reinforcement learning can now be practically applied to spoken dialogue models thanks to a new post-training recipe that disentangles semantic and acoustic improvements.
Current reward models for spoken dialogue systems are missing crucial paralinguistic and natural speech elements, but this new model closes the gap by operating directly on speech and outperforming existing audio LLMs.