Haihong E

Research focus

Eval Frameworks & Benchmarks (4) · Computer Vision (2) · Reasoning & Chain-of-Thought (2) · Multimodal Models (1) · Tool Use & Agents (1)

Frequent co-authors

Zichen Tang (4) · Zichen Tang (3) · E. Haihong (3) · Haocheng Gao (3)

Papers (4)

Tsinghua AI · Apr 30, 2026 · also BUPT, Corresponding author

Decoding Scientific Experimental Images: The SPUR Benchmark for Perception, Understanding, and Reasoning

Today's best vision-language models are surprisingly bad at reading scientific figures, failing to match expert-level reasoning on a new benchmark of experimental images.

Junpeng Ding, Zichen Tang, Zichen Tang +21

Computer Vision · Eval Frameworks & Benchmarks · Multimodal Models

Apr 30, 2026

RoadMapper: A Multi-Agent System for Roadmap Generation of Solving Complex Research Problems

LLMs can now generate research roadmaps that are 8% better and 84% faster than human experts, thanks to a novel multi-agent system.

Jiachen Liu, Zichen Tang, Zichen Tang +10

Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought · Tool Use & Agents

Apr 30, 2026

NeocorRAG: Less Irrelevant Information, More Explicit Evidence, and More Effective Recall via Evidence Chains

Retrieval improvements don't always boost reasoning in RAG systems, but NeocorRAG's evidence chains can fix that, achieving SOTA with 20% fewer tokens.

Shiyao Peng, Qianhe Zheng, Zhuodi Hao +8

Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought · Recommendation & Information Retrieval

Apr 30, 2026 · also Princeton

AEGIS: A Holistic Benchmark for Evaluating Forensic Analysis of AI-Generated Academic Images

Even GPT-5.1 struggles to distinguish AI-generated academic images from real ones, achieving only 48.8% accuracy, revealing a significant gap between generative and forensic AI capabilities.

Bo Zhang, T. Ma, Tzu-Yen Ma +30

Computer Vision · Data Curation & Synthetic Data · Eval Frameworks & Benchmarks
