University of Science and Technology of China, Stevens Institute of Technology
LLMs can be made significantly more robust to jailbreaking by red-teaming themselves through self-play, in which attack strategies evolve dynamically to uncover vulnerabilities.