Ginny Wong

NVAITC, NVIDIA

NVIDIA Research

Topics

Research focus

Computer Vision (1), Interpretability & Mechanistic Interp (1), Multimodal Models (1)

Frequent co-authors

Aaron Branson Cigres Li (1), Yu Zhao (1), Yiming Du (1), Haobo Li (1)

Papers (1)

May 26, 2026

NVIDIA · May 26, 2026 · also Tsinghua AI, Edinburgh, Fudan, NVAITC +1

Can Retrieval Heads See Images? Multimodal Retrieval Heads in Long-Context Vision-Language Models

Masking just 5% of attention heads in vision-language models tanks performance on long-context tasks, revealing a surprisingly sparse and critical set of "multimodal retrieval heads" that attend to both text and images.

Aaron Branson Cigres Li, Yu Zhao, Yiming Du +5

Computer Vision, Interpretability & Mechanistic Interp, Multimodal Models
