Songcheng Cai

Research focus

Eval Frameworks & Benchmarks (2), Tool Use & Agents (2), Natural Language Processing (1), Code Generation & Program Synthesis (1)

Frequent co-authors

Yi Lu (2), Ping Nie (2), Yuxuan Zhang (1), Yubo Wang (1)

Papers (2)

CMU ML · Apr 9, 2026 · also Tsinghua AI, NJU, Waterloo

ClawBench: Can AI Agents Complete Everyday Online Tasks?

Today's best AI agents complete only 33% of common online tasks, such as booking appointments or filling out job applications, revealing a significant gap between current capabilities and real-world utility.

Yuxuan Zhang, Yubo Wang, Yipeng Zhu +19

Eval Frameworks & Benchmarks, Natural Language Processing, Tool Use & Agents

Mar 17, 2026 · also Corresponding Author, NJU, Waterloo

SWE-QA-Pro: A Representative Benchmark and Scalable Training Recipe for Repository-Level Code Understanding

A Qwen3-8B model trained with a new SFT+RLAIF recipe beats GPT-4o at repository-level code understanding on SWE-QA-Pro, a challenging new benchmark.

Songcheng Cai, Z. Lyu, Yuansheng Ni +14

Code Generation & Program Synthesis, Eval Frameworks & Benchmarks, Tool Use & Agents
