Video-LLMs can be sped up by nearly 3x without sacrificing performance, simply by loosening the strict matching requirements of speculative decoding and focusing on visual-semantic relevance.
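
To make the idea of "loosening the strict matching requirement" concrete, here is a minimal sketch of the verification step in greedy speculative decoding. Everything in it is an assumption for illustration: the function name `accept_prefix`, the toy scores, and above all the top-k acceptance rule, which is only a generic stand-in for a relaxed criterion. The summary does not specify how the actual method decides acceptance, and the visual-semantic relevance component is not modeled here.

```python
from typing import List, Sequence


def accept_prefix(
    draft_tokens: Sequence[int],
    target_logits: Sequence[Sequence[float]],
    top_k: int = 1,
) -> int:
    """Count how many leading draft tokens the target model accepts.

    top_k=1 is strict greedy verification: a draft token is kept only if
    it equals the target model's argmax at that position.
    top_k>1 is a hypothetical relaxed rule: a draft token is kept if it is
    among the target model's k highest-scoring tokens.
    Verification stops at the first rejected token.
    """
    accepted = 0
    for token, logits in zip(draft_tokens, target_logits):
        # Rank vocabulary ids by target score, highest first.
        ranked: List[int] = sorted(
            range(len(logits)), key=lambda i: logits[i], reverse=True
        )
        if token in ranked[:top_k]:
            accepted += 1
        else:
            break
    return accepted


# Toy example: 3 drafted tokens over a 3-token vocabulary (made-up scores).
draft = [2, 0, 1]
target_scores = [
    [0.1, 0.2, 0.9],  # target prefers token 2
    [0.5, 0.6, 0.1],  # target prefers token 1, token 0 is a close second
    [0.3, 0.8, 0.2],  # target prefers token 1
]

strict_accepted = accept_prefix(draft, target_scores, top_k=1)
relaxed_accepted = accept_prefix(draft, target_scores, top_k=2)
print(strict_accepted, relaxed_accepted)
```

In this toy case the strict rule rejects the second draft token and keeps only one, while the relaxed rule keeps all three. Longer accepted runs mean fewer target-model passes per generated token, which is where a speedup from relaxed verification would come from.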