Yurong Chen

Inria, École Normale Supérieure, PSL Research University

Research focus

RLHF & Preference Learning (1), Scalable Oversight & Alignment Theory (1)

Frequent co-authors

Yu He (1), Michael I. Jordan (1), Fan Yao (1)

Papers (1)

Feb 12, 2026 · also BAIR, Northwestern

How Sampling Shapes LLM Alignment: From One-Shot Optima to Iterative Dynamics

LLM alignment can be destabilized by iterative training loops using model-generated preferences, leading to oscillations or entropy collapse under certain conditions.

Yurong Chen, Yu He, Michael I. Jordan +1

RLHF & Preference Learning, Scalable Oversight & Alignment Theory
