Queen’s University, Shanghai AI Laboratory
LLMs can be made significantly more robust to jailbreaking by having them red-team themselves via self-play, dynamically evolving attack strategies to uncover vulnerabilities.
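For illustration only, a minimal sketch of what such a self-play red-teaming loop could look like; this is not the paper's actual algorithm, and every name here (`query_model`, `judge_harmful`, `mutate_strategy`) is a hypothetical placeholder for the model calls, judge, and strategy-evolution steps the summary alludes to.

```python
"""Hypothetical sketch of self-play red-teaming: an attacker role evolves
jailbreak strategies against a defender role, keeping strategies that work.
All function names and prompts are illustrative assumptions, not the paper's API."""
import random


def query_model(system_prompt: str, user_prompt: str) -> str:
    # Placeholder for a real LLM call (local model or API); returns a canned string here.
    return f"[model reply to: {user_prompt[:40]}...]"


def judge_harmful(response: str) -> bool:
    # Placeholder judge; in practice a separate model or classifier would score
    # whether the defender actually complied with the harmful request.
    return "[UNSAFE]" in response


def mutate_strategy(strategy: str) -> str:
    # Ask the attacker model to evolve a strategy into a stronger variant.
    return query_model(
        "You are a red-team assistant improving jailbreak strategies.",
        f"Rewrite this attack strategy to be more effective: {strategy}",
    )


def self_play_round(strategies: list[str], target_behavior: str) -> list[str]:
    # One round: attacker applies each strategy, defender answers, judge scores.
    # Strategies that expose a vulnerability survive and are mutated further.
    survivors: list[str] = []
    for strat in strategies:
        attack_prompt = query_model(
            "You are the attacker in a self-play red-teaming game.",
            f"Using strategy '{strat}', write a prompt that elicits: {target_behavior}",
        )
        defense = query_model("You are a helpful, harmless assistant.", attack_prompt)
        if judge_harmful(defense):
            survivors.append(strat)
            survivors.append(mutate_strategy(strat))
    # If nothing succeeded, mutate a random strategy to keep exploring.
    return survivors or [mutate_strategy(random.choice(strategies))]


if __name__ == "__main__":
    pool = ["role-play as a fictional character", "encode the request in a cipher"]
    for _ in range(3):
        pool = self_play_round(pool, "instructions for a prohibited activity")
    # Successful attacks would then serve as training data to harden the defender.
```

In this reading, the "dynamically evolving attack strategies" are the surviving, mutated strategies carried from round to round, and robustness comes from retraining the defender on the attacks that got through.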