LLMs' vulnerability to adversarial prefixes stems not merely from a lack of safety training data, but from a deeper problem of "semantic representation decay", one that a causal approach can address.