Tired of LLM judges hallucinating when evaluating long, detailed speech captions? EmoSURA offers a more reliable, audio-grounded alternative by verifying atomic perceptual units.
CueNet achieves robust audio-visual speaker extraction under visual degradation by disentangling and then integrating three cues: speaker information, acoustic synchronisation, and semantic synchronisation. It does not need to be trained on degraded visual data.