RLHF can be made more stable and effective by explicitly verifying and reinforcing policy improvements against a historical baseline, rather than relying solely on instantaneous reward signals.
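A minimal sketch of that gating idea, not the paper's actual method: a candidate policy update is accepted only if its measured return beats a rolling historical baseline, rather than trusting the instantaneous reward alone. The names `evaluate`, `train`, and the scalar "policy skill" are hypothetical stand-ins for illustration.

```python
from collections import deque
import random

def evaluate(policy, n_episodes=32):
    """Stand-in evaluator: mean noisy reward of a scalar 'policy skill'."""
    return sum(policy + random.gauss(0, 0.5) for _ in range(n_episodes)) / n_episodes

def train(steps=100, lr=0.05, window=10):
    policy = 0.0
    history = deque(maxlen=window)            # rolling historical baseline
    history.append(evaluate(policy))
    for _ in range(steps):
        candidate = policy + lr * random.gauss(1.0, 1.0)  # proposed update
        baseline = sum(history) / len(history)
        score = evaluate(candidate)
        if score > baseline:                   # verify improvement vs. history
            policy = candidate                 # reinforce only verified gains
            history.append(score)
    return policy

if __name__ == "__main__":
    print(f"final policy skill: {train():.2f}")
```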
Forget noisy, biased LLM evaluators: CDRRM distills preference insights into compact rubrics, letting a frozen judge model leapfrog fully fine-tuned baselines with just 3k training samples.
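A toy sketch of the distill-then-judge pattern the summary describes, under loose assumptions: rubric criteria are mined from preference pairs, then a frozen judge scores new responses against the rubric instead of raw preferences. The keyword-counting rubric and scoring function below are illustrative stand-ins (CDRRM's rubrics and judge are LLM-based), and `distill_rubric` / `judge_with_rubric` are hypothetical names.

```python
from collections import Counter

def distill_rubric(preference_pairs, top_k=5):
    """Crude rubric: words frequent in chosen but absent from rejected responses."""
    counts = Counter()
    for chosen, rejected in preference_pairs:
        counts.update(set(chosen.lower().split()) - set(rejected.lower().split()))
    return [word for word, _ in counts.most_common(top_k)]

def judge_with_rubric(response, rubric):
    """Frozen 'judge': score = fraction of rubric criteria the response meets."""
    words = set(response.lower().split())
    return sum(criterion in words for criterion in rubric) / len(rubric)

pairs = [
    ("cites sources and stays concise", "rambles without evidence"),
    ("concise answer with sources", "long vague answer"),
]
rubric = distill_rubric(pairs)
print(rubric, judge_with_rubric("a concise reply that cites sources", rubric))
```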