Without any training, MLLMs can beat specialized models at image retrieval, especially when the target domain differs from those models' training data.
Achieve state-of-the-art open-vocabulary segmentation by distilling a sliding-window ViT into a single-pass architecture, enabling efficient high-resolution reasoning.
Just a handful of annotated examples can dramatically close the performance gap between zero-shot and fully supervised open-vocabulary segmentation.