Search papers, labs, and topics across Lattice.
3
0
4
13
A more robust evaluation framework for jailbreak methods, with a curated harmful question dataset, detailed case-by-case evaluation guidelines, and a scoring system equipped with these guidelines, demonstrates its ability to provide more fair and stable evaluation.
LLMs can proactively unearth silent bugs in deep learning libraries by transferring bug patterns from historical reports to similar APIs, finding 79 previously unknown bugs.
Existing defenses against indirect prompt injection in LLM agents are riddled with flaws, as demonstrated by three new adaptive attacks that easily bypass them.