Inria, École Normale Supérieure, PSL Research University, University of California, Berkeley
Berkeley AI Research (BAIR)
LLM alignment can be destabilized by iterative training loops using model-generated preferences, leading to oscillations or entropy collapse under certain conditions.