X. Wang

Papers on Lattice

Total citations

Topics

h-index

Publication activity (papers/week, last 8 weeks)

Research focus

Eval Frameworks & Benchmarks (4) · Tool Use & Agents (4) · Reasoning & Chain-of-Thought (2) · RLHF & Preference Learning (2)

Frequent co-authors

Alkesh Patel (2) · Zhe Gan (2) · Zhen Zhang (1) · Changyi Yang (1)

Papers (6)

Apr 29, 2026

Apple ML · Apr 29, 2026 · also CMU ML, UCSB

Length Value Model: Scalable Value Pretraining for Token-Level Length Modeling

Forget coarse sequence-level hacks: LenVM lets you precisely dial in token generation length, boosting a 7B model's length accuracy from 30.9 to 64.8 and crushing closed-source rivals.

Zhen Zhang, Changyi Yang, Zijie Xia +13

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Natural Language Processing

Apr 1, 2026

Apple ML · Apr 1, 2026 · also UCSB

Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants

Realistic user simulation is now possible: Pare offers a framework that moves beyond flat tool-calling APIs to model stateful user interactions, enabling better evaluation of proactive agents.

Deepak Nathani, Chang Huan, Jiaming Shan +7

Eval Frameworks & Benchmarks · Tool Use & Agents · World Models & Planning

Mar 19, 2026

Context Bootstrapped Reinforcement Learning

Injecting demonstrations with a carefully annealed probability can drastically improve exploration in RLVR, even for tasks requiring novel reasoning or domain-specific knowledge.

Saaket Agashe, Jayanth Srinivasa, Gaowen Liu +4

Reasoning & Chain-of-Thought · RLHF & Preference Learning · Tool Use & Agents · +1

Mar 16, 2026

MiroMind Team S. Bai +36 · Mar 16, 2026 · also CAS

MiroThinker-1.7&H1: Towards Heavy-Duty Research Agents via Verification

By verifying its reasoning steps both locally and globally, MiroThinker-H1 achieves state-of-the-art performance in complex research tasks, demonstrating the power of integrated verification for reliable multi-step problem solving.

MiroMind Team S. Bai, L. Bing, L. Lei +34

Eval Frameworks & Benchmarks · Reasoning & Chain-of-Thought · Tool Use & Agents

Feb 18, 2026

Feb 18, 2026 · also Google Research

Learning Situated Awareness in the Real World

Despite recent advances, multimodal models still struggle to understand spatial relationships from an egocentric perspective, as shown by a 37.66% performance gap on the new SAW-Bench benchmark.

Chuhan Li, Chuhan Li, Chuhan Li +12

Eval Frameworks & Benchmarks · Multimodal Models · Robotics & Embodied AI

Feb 12, 2026

Google Research · Feb 12, 2026 · also TCD, UMass, Wayfair, Yanshan +1

CM2: Reinforcement Learning with Checklist Rewards for Multi-Turn and Multi-Step Agentic Tool Use

Forget hand-crafted reward functions: CM2 uses checklists to train tool-using agents, outperforming SFT baselines by up to 12 points on key benchmarks.

Xun Wang, Yebowen Hu, Chenyang Zhao +5

Eval Frameworks & Benchmarks · RLHF & Preference Learning · Tool Use & Agents
