Video language models can cut time-to-first-token by up to 86% and token usage by up to 93% by replacing full-image encoding with the motion vectors and residuals that video codecs already compute.