RLVR, the dominant training paradigm for audio language models, may be turning them into unfeeling "answering machines" that excel on benchmarks but fail the vibe check.
Reward models can reach the right verdict through dangerously flawed reasoning, and those flawed rationales degrade downstream RLHF. R-Align addresses this by explicitly aligning a reward model's rationales with gold-standard judgments.