Forget slow, fine-grained vision-language models: T-REN slashes token counts by 24x (images) and 187x (videos) while boosting performance on segmentation, retrieval, and localization.