User-defined response prefixes in LLMs are a major safety risk, enabling CoT attacks to achieve near-perfect success rates on some models.