SwanVoice advances zero-shot TTS by generating expressive, multi-speaker dialogue with a single model, bridging the gap between monologue quality and conversational coherence.
SwanSphere achieves real-time, high-fidelity spatial audio generation from panoramic video and text, overcoming the latency and spatial accuracy limitations of existing methods.
Current speech generation models still fall short in maintaining consistency and capturing nuanced expressiveness when generating long-form speech, despite advances in high-fidelity synthesis.