Search papers, labs, and topics across Lattice.
1
0
3
MLLMs can now efficiently process 10K-frame videos without training, by adaptively selecting tokens based on the model's own uncertainty about the content.