Yuanhao Ban

Rethinking supervised fine-tuning as target distribution design reveals that optimizing token likelihood may overlook richer model knowledge, and the reframing leads to significant performance gains.

Tong Xie, Yuanhao Ban, Yunqi Hong +3

Natural Language Processing · Scalable Oversight & Alignment · Theory · Training Efficiency & Optimization

May 22, 2026

Tsinghua AI · May 22, 2026 · also Arena Intelligence Inc

One-Forcing: Towards Stable One-Step Autoregressive Video Generation

One-Forcing achieves state-of-the-art one-step video generation while slashing training costs to a third of previous methods.

Jiaqi Feng, Justin Cui, Yuanhao Ban +1

Computer Vision · World Models & Planning

May 20, 2026

University of California · May 20, 2026 · also Arena Intelligence Inc

AutoRubric-T2I: Robust Rule-Based Reward Model for Text-to-Image Alignment

Forget training costly reward models for text-to-image alignment – AutoRubric-T2I learns interpretable rubrics that outperform them using less than 0.01% of the data.

Kuei-Chun Kao, Daixuan Huo, Yuanhao Ban +1

Computer Vision · Eval Frameworks & Benchmarks · Multimodal Models +1

Papers (4)