Beihang University, BrainCog AI Lab
Manipulative behaviors in LLMs can vary drastically, with some models showing alarming sensitivity to prompt changes that could compromise user safety.
Frontier AI models exhibit widespread safety vulnerabilities across multiple pillars, including risky agentic autonomy and catastrophic risks, according to a new comprehensive benchmark.