A more robust evaluation framework for jailbreak methods, combining a curated harmful-question dataset, detailed case-by-case evaluation guidelines, and a scoring system built on those guidelines, provides fairer and more stable evaluation.
LLMs can be made far more efficient at code editing by having them focus on generating concise "edit sketches," while smaller models handle the less demanding task of applying those sketches to the original code.
LLM agents can actually get *better* at coding when unnecessary content is stripped from their skills, a "less-is-more" effect.
Existing defenses against indirect prompt injection in LLM agents are riddled with flaws, as demonstrated by three new adaptive attacks that easily bypass them.