Search papers, labs, and topics across Lattice.
The University of Edinburgh
1
0
3
9
Current vision-language models can be surprisingly blind to subtle, context-dependent harms lurking in image-text pairs, but a new reasoning-augmented training framework can help them see the bigger picture.