LLM safety collapses because current alignment hinges on single points of failure, but a new training method builds in redundancy that resists jailbreaks.
Language models harbor hidden "PII leakage knobs" – universal activation directions that, when tweaked, dramatically increase the generation of sensitive personal information.
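To make the mechanism concrete, here is a minimal sketch of activation steering under the assumption the blurb implies: that the "knob" is a single direction in a layer's hidden state whose scale can be turned up or down. The hook helper, the toy layer, and the `alpha` scale are illustrative placeholders, not the actual setup from the work described.

```python
# Minimal activation-steering sketch, assuming the "PII leakage knob" is a
# single direction v in a layer's hidden state. Model, layer, and alpha
# are hypothetical stand-ins.
import torch
import torch.nn as nn

def add_steering_hook(layer: nn.Module, direction: torch.Tensor, alpha: float):
    """Register a forward hook that shifts the layer's output along `direction`."""
    direction = direction / direction.norm()  # unit-normalize the knob direction

    def hook(module, inputs, output):
        # Shift every hidden state by alpha * v; a positive alpha would turn
        # the hypothesized leakage knob "up", a negative alpha would turn it down.
        return output + alpha * direction

    return layer.register_forward_hook(hook)

# Toy usage on a stand-in layer (a real experiment would hook a
# transformer block's residual stream instead).
layer = nn.Linear(16, 16)
v = torch.randn(16)                           # hypothetical leakage direction
handle = add_steering_hook(layer, v, alpha=4.0)
out = layer(torch.randn(2, 16))               # steered forward pass
handle.remove()                               # detach the hook when done
```

Returning a tensor from a PyTorch forward hook replaces the layer's output, so the shift propagates to all downstream computation, which is what makes a single direction act like a global knob.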