Search papers, labs, and topics across Lattice.
1
0
7
Masking compositional concepts in one modality while leveraging contextual cues from another can dramatically enhance the compositionality of vision-language models.