Microsoft AI Red Team
AdvGRPO enables robust attacker-defender co-training that significantly improves defender performance on safety benchmarks while generating effective attacks.