LLMs can be jailbroken with 90% success by subtly "salami-slicing" harmful intent across multiple turns, even against state-of-the-art models like GPT-4o and Gemini.
LLM-based multi-agent systems exhibit 20 distinct risk types, ranging from single-agent vulnerabilities to system-level emergent hazards, demanding a unified safety evaluation and monitoring framework.