Shanghai Jiao Tong University, WeChat AI
MLLMs can achieve near-identical performance on long-form visual tasks with just 2.5% of the original visual tokens by mimicking human visual attention.