Vision-language models (VLMs) can achieve 7.8x faster prefilling with only a minor accuracy drop by pruning redundant visual tokens *without* retraining.