By focusing on structural cues, StructXLIP significantly boosts vision-language alignment, outperforming existing methods in cross-modal retrieval tasks.