Xi Su

Papers on Lattice

Total citations

Topics

Publication activity (papers/week, last 8 weeks)

Research focus

Eval Frameworks & Benchmarks (3) · Recommendation & Information Retrieval (2) · Tool Use & Agents (2)

Frequent co-authors

Xunliang Cai (2) · Qi Gu (2) · Jiarui Zhao (1) · Rongzhi Zhang (1)

Papers (3)

Jun 11, 2026

Jiarui Zhao +6 · 1w ago

LoHoSearch: Benchmarking Long-Horizon Search Agents Beyond the Human Difficulty Ceiling

Even the best search agents struggle to exceed 35% accuracy on a benchmark designed to push the limits of long-horizon reasoning.

Jiarui Zhao, Rongzhi Zhang, Lingchuan Liu +4

Eval Frameworks & Benchmarks · Recommendation & Information Retrieval

May 26, 2026

3w ago · also NUS, Tsinghua AI, BUPT, Meituan +2

VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions

Current LLM agents still struggle to infer and leverage user preferences from fragmented, real-world interactions, revealing a substantial gap between their capabilities and the demands of personalized decision-making.

Yuxin Chen, Yi Zhang, Zhengzhou Cai +8

Eval Frameworks & Benchmarks · Recommendation & Information Retrieval · Tool Use & Agents

Apr 20, 2026

Wentao Shi +13 · Apr 20, 2026

AJ-Bench: Benchmarking Agent-as-a-Judge for Environment-Aware Evaluation

Agent-as-a-Judge can outperform LLM-as-a-Judge in complex environments, but still struggles to reliably verify agent behavior, revealing a critical gap in current LLM-based agent evaluation.

Wentao Shi, Yu Wang, Yuyang Zhao +11

Eval Frameworks & Benchmarks · Tool Use & Agents