Search papers, labs, and topics across Lattice.
2
0
5
18
Speaker drift, a subtle but pervasive flaw in modern TTS, can now be automatically detected by prompting LLMs with geometric representations of speaker embeddings.
You can slash the text encoder size in vision-language segmentation models by 88% without sacrificing performance, thanks to surprisingly high redundancy in how these models process prompts.