University of California, Riverside
LLMs reveal their susceptibility to jailbreaks in their hidden activations, which allows lightweight detection and even real-time disruption of attacks.