Xiangxiang Chu

Shifting credit assignment to fine-grained decision points boosts agentic RL performance by nearly 4 points, challenging the conventional focus on tool-call boundaries.

Xucong Wang, Ziyu Ma, Yong Wang +5

RLHF & Preference Learning, Tool Use & Agents

Jun 9, 2026

DAMO (6d ago)

Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution

Bootstrapping LLM agents to co-evolve as both agent and environment can improve performance on complex tasks by more than 4% on average.

Xucong Wang, Ziyu Ma, Shidong Yang +4

Scalable Oversight & Alignment Theory, Tool Use & Agents

May 21, 2026

DAMO (3w ago)

TransitLM: A Large-Scale Dataset and Benchmark for Map-Free Transit Route Generation

Forget maps: LLMs can learn end-to-end transit route planning directly from data, even grounding GPS coordinates without explicit mapping.

Hanyu Guo, Jiedong Yang, Longfei Xu +2

Data Curation & Synthetic Data, Eval Frameworks & Benchmarks, Natural Language Processing

Apr 9, 2026

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

LLM agents can now learn from *everyone's* experience, not just their own, leading to system-wide improvements without requiring additional user effort.

Ziyu Ma, Shidong Yang, Yuxiang Ji +5

Eval Frameworks & Benchmarks, Tool Use & Agents

Papers (5)