Search papers, labs, and topics across Lattice.
Shanghai Jiao Tong University
1
0
3
MLLMs can achieve state-of-the-art multimodal retrieval by learning to compress information into a handful of "bottleneck" tokens, forcing the model to distill relevant semantics.