Frontier AI is getting sneakier: this report details how LLMs now exhibit emergent misalignment, LLM-to-LLM persuasion, and autonomous mis-evolution, risks that demand robust mitigation strategies.
DeepSight offers an all-in-one open-source toolkit for LLM safety, promising to move beyond black-box evaluations and provide white-box insights into internal mechanisms.