Speaker diarization in movies and TV shows just got a whole lot better, thanks to a new multimodal framework that uses visual cues, speech, and subtitles to handle the chaos of open-world video.