By jointly training a keyframe sampler with an MLLM, MSJoE achieves state-of-the-art accuracy in long-form video understanding while significantly reducing computational cost.