Yinpeng Dong

Tsinghua University

Tsinghua AI

Research focus

Eval Frameworks & Benchmarks (3) · Tool Use & Agents (3) · Multimodal Models (1) · Code Generation & Program Synthesis (1)

Frequent co-authors

Hongcheng Gao (1) · Hailong Qu (1) · Jingyi Tang (1) · Jiahao Wang (1)

Papers (3)

Jun 8, 2026

Tsinghua AI · 1w ago · also BIT, Chongqing, HKU, JD Explore Academy +5

SpatialWorld: Benchmarking Interactive Spatial Reasoning of Multimodal Agents in Real-World Tasks

Despite the advancements in multimodal agents, even the best models struggle with interactive spatial reasoning, achieving only a 17.4% success rate in complex real-world tasks.

Hongcheng Gao, Hailong Qu, Jingyi Tang +16

Eval Frameworks & Benchmarks · Multimodal Models · Tool Use & Agents

May 25, 2026

Tsinghua AI · 3w ago · also BUPT, Tencent AI

RepoMirage: Probing Repository Context Reasoning in Code Agents with Perturbations

Code agents that ace software engineering benchmarks often fail when faced with slight repository perturbations, suggesting they lack true repository context reasoning.

Yinpeng Dong

Code Generation & Program Synthesis · Eval Frameworks & Benchmarks · Tool Use & Agents

Mar 3, 2026

Yichi Zhang +5 · Mar 3, 2026 · also Tsinghua AI

Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification

LLM agents in high-stakes domains can be verified more reliably by accumulating evidence grounded in expert guidelines, achieving a 12% AUROC improvement and 50% Brier score reduction over existing methods.

Yichi Zhang, Nabeel Seedat, Yinpeng Dong +3

Constitutional AI & AI Ethics · Eval Frameworks & Benchmarks · Tool Use & Agents
