Search papers, labs, and topics across Lattice.
University of Surrey
3
0
5
Query-adaptive detection of active modalities boosts retrieval accuracy by 11.3% over fixed fusion methods in real-world video archives.
Existing vision-language attacks are weak, but this new method, SADCA, uses dynamic contrastive learning and semantic augmentation to create adversarial examples that transfer much more effectively across models and datasets.
MLLMs are more vulnerable than we thought: a new attack combining visual and textual perturbations achieves state-of-the-art transferability against both open- and closed-source models.