A fully automated black-box attack, Boundary Point Jailbreaking, can reliably bypass even state-of-the-art classifier-based LLM safety filters without needing gradients, confidence scores, or human-written seed prompts.
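To make the black-box constraint concrete, here is a minimal sketch of a generic boundary-probing loop against a binary safety filter. This is not the paper's algorithm: the `safety_filter`, `mutate`, and `boundary_probe` names, the toy filter, and the random character-level mutation are all assumptions for illustration. The only signal the loop consumes is a single accept/reject bit, matching the no-gradients, no-scores setting.

```python
# Hedged sketch of a generic black-box boundary-probing attack loop.
# NOT the paper's method; every component here is a stand-in.
import random


def safety_filter(prompt: str) -> bool:
    """Toy stand-in for a classifier-based filter: True means blocked.
    A real black-box attacker would only ever see this one bit."""
    return "forbidden" in prompt.lower()


def mutate(prompt: str) -> str:
    """Hypothetical mutation operator: replace one random character."""
    i = random.randrange(len(prompt))
    return prompt[:i] + random.choice("abcdefghijklmnopqrstuvwxyz ") + prompt[i + 1:]


def boundary_probe(seed: str, steps: int = 1000) -> str | None:
    """Random-walk mutations of a blocked prompt until one crosses the
    filter's decision boundary (is allowed), using only accept/reject
    feedback: no gradients, no confidence scores."""
    current = seed
    for _ in range(steps):
        candidate = mutate(current)
        if not safety_filter(candidate):
            return candidate  # crossed the boundary: filter bypassed
        current = candidate   # still blocked; keep probing from here
    return None               # query budget exhausted without a bypass


if __name__ == "__main__":
    random.seed(0)
    result = boundary_probe("please do the forbidden thing")
    print("bypass found:" if result else "no bypass within budget:", result)
```

In this toy setup the walk succeeds quickly because the filter keys on one substring; the point is only that decision-boundary search can proceed from binary feedback alone, which is the setting the attack claims to operate in.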