Search papers, labs, and topics across Lattice.
University of Toronto
1
0
3
Medical-specific vision-language models surprisingly underutilize visual information in Japanese medical licensing exams, often performing well even when images are removed, highlighting a critical gap in their multimodal reasoning capabilities.