Xin Eric Wang

Ground-truth access in the task-generating proposer can paradoxically *accelerate* self-play collapse, suggesting that ungrounded proposers might be more stable partners for self-consistency solvers.

Chengzhi Liu, Jayanth Srinivasa, Gaowen Liu +2

Data Curation & Synthetic Data · Reasoning & Chain-of-Thought · RLHF & Preference Learning

Apr 29, 2026

Apple ML · Apr 29, 2026 · also CMU ML, Huawei, UCSB

Length Value Model: Scalable Value Pretraining for Token-Level Length Modeling

Forget coarse sequence-level hacks: LenVM lets you precisely dial in token generation length, boosting a 7B model's length accuracy from 30.9 to 64.8 and crushing closed-source rivals.

Zhen Zhang, Changyi Yang, Zijie Xia +13

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Natural Language Processing

Apr 20, 2026

Gonzalo Gonzalez-Pumariega +4 · Apr 20, 2026

On the Reliability of Computer Use Agents

Even when a computer-use agent succeeds once, inconsistent task specification and variable agent behavior can tank its reliability.

Gonzalo Gonzalez-Pumariega, Saaket Agashe, Jiachen Yang +2

Eval Frameworks & Benchmarks · Tool Use & Agents




Papers (4)