Kento Masui



Publication activity (papers/week, last 8 weeks)

Research focus

Eval Frameworks & Benchmarks (1), Multimodal Models (1), Reasoning & Chain-of-Thought (1)

Frequent co-authors

Shun Uebayashi (1), Shunki Uebayashi (1), Kyohei Atarashi (1), Han Bao (1)

Papers (1)

Mar 3, 2026


Evaluating Cross-Modal Reasoning Ability and Problem Characteristics with Multimodal Item Response Theory

Current multimodal benchmarks are full of single-modality shortcuts, but this paper offers a way to prune them, yielding more reliable and efficient evaluations of true cross-modal reasoning.

Shun Uebayashi, Shunki Uebayashi, Kento Masui +6

Eval Frameworks & Benchmarks, Multimodal Models, Reasoning & Chain-of-Thought
