A new vision-language model closes the gap for Vietnamese image-text retrieval, outperforming standard CLIP models by over 11% in zero-shot performance.