James Cheng

Research focus

RLHF & Preference Learning (2)
Eval Frameworks & Benchmarks (1)
Reasoning & Chain-of-Thought (1)
Tool Use & Agents (1)

Frequent co-authors

Zizhe Chen (1)
Jiqian Dong (1)
Yizhou Tian (1)
Garry Yang (1)

Papers (2)

Zizhe Chen +6 · May 28, 2026 · also CMU ML, MBZUAI

Hista and Numca: Estimate State Value Effectively for LLM Reinforcement Learning

Standard RL critics for LLMs are basically useless, but these two simple methods can fix them.

Zizhe Chen, Jiqian Dong, Yizhou Tian +4

Eval Frameworks & Benchmarks · RLHF & Preference Learning

Mar 12, 2026 · also Independent Researcher

On Information Self-Locking in Reinforcement Learning for Active Reasoning of LLM agents

RL-trained LLM agents can get stuck in an "information self-locking" trap, failing to ask the right questions and internalize information, but a simple learning signal reallocation can break them out.

Deyu Zou, Yongqiang Chen, Fan Feng +5

Reasoning & Chain-of-Thought · RLHF & Preference Learning · Tool Use & Agents
