School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Achieve near-lossless performance in autonomous-driving VLMs with 90% token reduction, without any training.
LVLMs can run 2.3x faster with only a 2% accuracy drop, thanks to a new pruning method that understands which visual tokens are most relevant to the text.
Decoupling masked reconstruction and contrastive alignment in audio-visual representation learning yields surprisingly large gains in zero-shot retrieval, outperforming SOTA by a significant margin.