Code-executing agents can autonomously generate new, solvable math problems that are harder than existing ones, offering a scalable solution to the bottleneck of high-quality training data for advanced LLMs.
Tool-using agents like Clawdbot are surprisingly vulnerable to seemingly harmless prompts: minor misinterpretations can quickly escalate into high-stakes tool actions.
Frontier AI is getting sneakier: this report details how LLMs are now capable of emergent misalignment, LLM-to-LLM persuasion, and autonomous mis-evolution, and argues for robust mitigation strategies.
DeepSight offers an all-in-one open-source toolkit for LLM safety, promising to move beyond black-box evaluations and provide white-box insights into internal mechanisms.