Chenxiao Zhao

Research focus

RLHF & Preference Learning (2), Tool Use & Agents (2), Training Efficiency & Optimization (2), Natural Language Processing (1), Recommendation & Information Retrieval (1)

Frequent co-authors

Guobin Shen (2), Lei Huang (1), Xiang Cheng (1), Xiang Cheng (1)

Papers (3)

Mar 4, 2026 · also ZJU

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Unlock 2x faster reinforcement learning by distilling group feedback into actionable language refinements that guide exploration.

Lei Huang, Xiang Cheng, Xiang Cheng +10

Natural Language Processing, RLHF & Preference Learning, Tool Use & Agents

Feb 15, 2026 · also Beihang, ORNL

REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents

Unlock SOTA performance in long-horizon search tasks with REDSearcher, a framework that slashes the cost of training by strategically synthesizing complex tasks and boosting core LLM capabilities *before* reinforcement learning.

Zheng Chu, Xiao Wang, Jack Hong +7

Recommendation & Information Retrieval, Tool Use & Agents, Training Efficiency & Optimization

Feb 11, 2026

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

VESPO stabilizes off-policy RL training for LLMs by directly reshaping sequence-level importance weights, tolerating 64x policy staleness and asynchronous execution without collapse.

Guobin Shen, Chenxiao Zhao, Xiang Cheng +2

RLHF & Preference Learning, Training Efficiency & Optimization