Current MLLMs struggle to connect who is speaking with what they are saying in videos, highlighting a critical gap in fine-grained audiovisual reasoning that AV-SpeakerBench now exposes.