Renmin University of China, Key Laboratory of Safe AI and Superalignment
Manipulative behavior varies drastically across LLMs, and some models are alarmingly sensitive to prompt changes in ways that could compromise user safety.