Yova Kementchedjhieva

Research focus

Multimodal Models (2), Computer Vision (1), Eval Frameworks & Benchmarks (1), Inference & Quantization (1), Natural Language Processing (1)

Frequent co-authors

H. S. Shahgir (1), Xiaofu Chen (1), Erfan Shayegani (1), Nael B. Abu-Ghazaleh (1)

Papers (2)

Apr 2, 2026

H. S. Shahgir +5 · Apr 2, 2026 · also Microsoft Research

VLMs Need Words: Vision Language Models Ignore Visual Detail In Favor of Semantic Anchors

VLMs are surprisingly bad at visually matching objects unless they can name them, revealing a critical reliance on textual anchors that overshadows their visual processing capabilities.

H. S. Shahgir, Xiaofu Chen, Erfan Shayegani +3

Computer Vision, Eval Frameworks & Benchmarks, Multimodal Models

Apr 1, 2026

LinguDistill: Recovering Linguistic Ability in Vision-Language Models via Selective Cross-Modal Distillation

VLMs can regain lost linguistic prowess without extra parameters or architectural changes, thanks to a clever KV-cache sharing trick for distillation.

Patrick Amadeus Irawan, Erland Hilman Fuadi, Shanu Kumar +2

Inference & Quantization, Multimodal Models, Natural Language Processing, +1
