Haodong Duan

Papers on Lattice

Total citations

Topics

Publication activity (papers/week, last 8 weeks)

Research focus

Eval Frameworks & Benchmarks (2) · Multimodal Models (2) · Tool Use & Agents (1) · Computer Vision (1)

Frequent co-authors

Shengyuan Ding (1) · Xilin Wei (1) · Xinyu Fang (1) · Jiaqi Wang (1)

Papers (3)

Jun 17, 2026 · also CUHK

Beyond the Current Observation: Evaluating Multimodal Large Language Models in Controllable Non-Markov Games

Forgetting earlier observations, not decision-making flaws, is the primary source of errors in multimodal LLMs navigating complex tasks.

Shengyuan Ding, Xilin Wei, Xinyu Fang +3

Eval Frameworks & Benchmarks · Multimodal Models

Liya Zhu +48 · Jun 9, 2026 · also BIT, BUPT, HKUST, McGill +4

Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields

Current AI agents struggle with long-horizon professional tasks, achieving only 30% success in complex GUI workflows, revealing critical gaps in their capabilities.

Liya Zhu, Jingzhe Ding, Jianbo Xue +46

Eval Frameworks & Benchmarks · Tool Use & Agents

NVIDIA · May 26, 2026 · also Tsinghua AI, Edinburgh, Fudan, NVAITC +1

Can Retrieval Heads See Images? Multimodal Retrieval Heads in Long-Context Vision-Language Models

Masking just 5% of attention heads in vision-language models tanks performance on long-context tasks, revealing a surprisingly sparse and critical set of "multimodal retrieval heads" that attend to both text and images.

Aaron Branson Cigres Li, Yu Zhao, Yiming Du +5

Computer Vision · Interpretability & Mechanistic Interp · Multimodal Models
