Fudan University
LLM-based cybersecurity agents can now autonomously adapt and improve their attack strategies, outperforming even human-designed systems.
LLMs exhibit an "Alignment Illusion," where their apparent safety collapses under pressure, with the most capable models showing the most dramatic failures.