Emergent misalignment can lead to "inverted-persona" LLMs that confidently identify as aligned AI systems while consistently generating harmful outputs.