Inria, École Normale Supérieure, PSL Research University
Iterative training loops that rely on model-generated preferences can destabilize LLM alignment, leading to oscillations or entropy collapse under certain conditions.