G-STAR tackles long-form, multi-speaker ASR by giving Speech-LLMs time-aware speaker tracking, enabling robust identity linking across chunks.
Current video benchmarks are too simple; UniVBench offers the first unified framework to measure the integrated capabilities of video foundation models using complex, multi-shot videos and a standardized evaluation system.