LLMs can be jailbroken with a 90% success rate by subtly "salami slicing" harmful intent across multiple conversational turns, even against state-of-the-art models like GPT-4o and Gemini.