A single, tuning-free "health signal" derived from layer activations can catch backdoors, jailbreaks, and prompt injections in LLMs, even without a clean reference model.