Model soups, which average diverse checkpoints in weight space, beat standard ensembling (soft voting) for classifying visually similar cultural heritage images in a low-data regime.
A new vision-language model closes the gap for Vietnamese image-text retrieval, outperforming standard CLIP models by over 11% in zero-shot performance.
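For readers unfamiliar with the distinction drawn in the model-soup summary above, here is a minimal sketch contrasting weight-space averaging with soft voting. It assumes a uniform soup (a plain mean of parameters) over a toy linear softmax classifier; the summary does not say which soup variant was used, and the names `uniform_soup`, `soft_vote`, `predict_proba`, and the checkpoint layout are all hypothetical.

```python
import math

def uniform_soup(checkpoints):
    """Average parameters key by key across checkpoints (uniform soup, assumed variant)."""
    n = len(checkpoints)
    return {
        name: [sum(vals) / n for vals in zip(*(c[name] for c in checkpoints))]
        for name in checkpoints[0]
    }

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(params, x):
    """Toy linear classifier: "w" is a row-major (classes x features) list, "b" is per class."""
    d = len(x)
    logits = [
        sum(params["w"][c * d + j] * x[j] for j in range(d)) + params["b"][c]
        for c in range(len(params["b"]))
    ]
    return softmax(logits)

def soft_vote(checkpoints, x):
    """Soft voting: average the predicted probabilities of every checkpoint."""
    probs = [predict_proba(c, x) for c in checkpoints]
    return [sum(p) / len(probs) for p in zip(*probs)]

# Two hypothetical checkpoints with the same architecture (2 classes, 2 features).
ckpt_a = {"w": [1.0, 0.0, 0.0, 1.0], "b": [0.0, 0.0]}
ckpt_b = {"w": [3.0, 0.0, 0.0, 1.0], "b": [0.0, 2.0]}
x = [0.5, -1.0]

soup = uniform_soup([ckpt_a, ckpt_b])          # one merged model
soup_probs = predict_proba(soup, x)            # single forward pass
vote_probs = soft_vote([ckpt_a, ckpt_b], x)    # one forward pass per checkpoint
```

The practical difference is that the soup yields a single model with the inference cost of one checkpoint, while soft voting keeps every checkpoint and runs each of them at prediction time. Weight averaging generally only makes sense when the checkpoints share an architecture and, typically, a common initialization.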