Apr 23, 2026 | arXiv:2604.21277

Can MLLMs "Read" What is Missing?

AI Summary

The paper introduces MMTR-Bench, a benchmark for evaluating how well Multimodal Large Language Models (MLLMs) reconstruct masked text directly from visual context, without explicit prompts. By requiring models to recover masked text from single- or multi-page documents and webpages, the benchmark isolates layout understanding, visual grounding, and knowledge integration. Its test samples span multiple languages and varying target lengths. Experiments on representative MLLMs show that MMTR-Bench poses a significant challenge, particularly for sentence- and paragraph-level reconstruction.

Key Contribution

MMTR-Bench, a 2,771-sample benchmark with a level-aware evaluation protocol, shows that MLLMs struggle to "read" missing text directly from visual context, especially at the sentence and paragraph level.

Abstract

We introduce MMTR-Bench, a benchmark designed to evaluate the intrinsic ability of Multimodal Large Language Models (MLLMs) to reconstruct masked text directly from visual context. Unlike conventional question-answering tasks, MMTR-Bench eliminates explicit prompts, requiring models to recover masked text from single- or multi-page inputs across real-world domains such as documents and webpages. This design isolates the reconstruction task from instruction-following abilities, enabling a direct assessment of a model's layout understanding, visual grounding, and knowledge integration. MMTR-Bench comprises 2,771 test samples spanning multiple languages and varying target lengths. To account for this diversity, we propose a level-aware evaluation protocol. Experiments on representative MLLMs show that the benchmark poses a significant challenge, especially for sentence- and paragraph-level reconstruction. The homepage is available at https://mmtr-bench-dataset.github.io/MMTR-Bench/.

Eval Frameworks & Benchmarks, Multimodal Models

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
