School of Computer Science, Institute of Artificial Intelligence, Hubei Key Laboratory of Multimedia and Network Communication Engineering, National Engineering Research Center for Multimedia Software, Wuhan University
Video-LLMs are leaving performance on the table: explicitly anchoring to keyframes before answering questions unlocks significant gains in Video TextVQA.