Yixuan Yuan

The Chinese University of Hong Kong

Papers on Lattice

Total citations

Topics

h-index

Publication activity (papers/week, last 8 weeks)

Research focus

Eval Frameworks & Benchmarks (1), Tool Use & Agents (1)

Frequent co-authors

Chenxing Li (1), Chenxin Li (1), Zhengyang Tang (1), Huangxin Lin (1)

Papers (1)

Apr 30, 2026

3w ago · also HKU, HKUST, PKU, SCUT +1

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

LLM agents still fail to reliably automate real-world workflows, with even the best models succeeding on only two-thirds of tasks in a new live benchmark.

Chenxing Li, Chenxin Li, Zhengyang Tang +9

Eval Frameworks & Benchmarks, Tool Use & Agents
