He Zhu

Papers on Lattice

Total citations

Topics

h-index

Publication activity (papers/week, last 8 weeks)

Research focus

Eval Frameworks & Benchmarks (4) · Tool Use & Agents (4) · Multimodal Models (2) · Robotics & Embodied AI (1)

Frequent co-authors

Qianqian Xie (3) · Xueming Han (2) · Xue Han (2) · Jiaheng Liu (2)

Papers (5)

Jun 1, 2026

Jun 1, 2026·also JIUTIAN Research

Where Do Deep-Research Agents Go Wrong? Span-Level Error Localization in Agent Trajectories

Span-level error localization can boost deep-research agent reliability by up to 30 percentage points, revealing critical insights into where agents go wrong.

Jiaming Wang, Ziteng Feng, Jiangtao Wu +9

Eval Frameworks & Benchmarks · Tool Use & Agents

Jun 1, 2026·also PKU, Zhongguancun Laboratory

TVIR: Building Deep Research Agents Towards Text--Visual Interleaved Report Generation

TVIR-Agent reveals that integrating visual elements into report generation can dramatically improve the quality and reliability of analytical outputs.

Xinkai Ma, Zhiqi Bai, Dingling Zhang +21

Eval Frameworks & Benchmarks · Multimodal Models

Apr 16, 2026

Apr 16, 2026·also JIUTIAN Research, Kling Team

DR$^{3}$-Eval: Towards Realistic and Reproducible Deep Research Evaluation

Current research agents still struggle with retrieval robustness and hallucination control, even when evaluated in a static, verifiable research environment.

Qianqian Xie, Qing Xiong, He Zhu +16

Eval Frameworks & Benchmarks · Multimodal Models · Tool Use & Agents

Feb 26, 2026

DAMO · Feb 26, 2026·also Baidu, CAS, USTC

MobilityBench: A Benchmark for Evaluating Route-Planning Agents in Real-World Mobility Scenarios

LLMs can handle basic route planning, but fall apart when user preferences enter the mix, as shown by a new benchmark based on real-world queries.

Zhiheng Song, Jingshuai Zhang +7

Eval Frameworks & Benchmarks · Robotics & Embodied AI · Tool Use & Agents

Feb 26, 2026·also IQuest Research, Macquarie, NJU, NYU +4

Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization

Ditch the deep thought: this new agentic search framework slashes reasoning steps by 70% while boosting accuracy by prioritizing parallel evidence gathering.

Qianben Chen, Tianrui Qin +30

Reasoning & Chain-of-Thought · Recommendation & Information Retrieval · Tool Use & Agents