1 Zhejiang University; 2 Qwen Team, Alibaba Group; 3 Shanghai Jiao Tong University; 4 Tsinghua University. Contact: zuozhu.liu@zju.edu.cn
Tsinghua AI2
Forget real-world video datasets: training VLMs on just 7.7K synthetic videos with temporal primitives beats 165K real-world examples, unlocking surprisingly effective transfer learning for video reasoning.
Current video benchmarks are too simple; UniVBench offers the first unified framework to measure the integrated capabilities of video foundation models using complex, multi-shot videos and a standardized evaluation system.