Radboud University
Control knobs for LLM safety exist: MASCing steers mixture-of-experts (MoE) behavior *without* costly retraining, improving jailbreak defense by up to 89.2% and control over adult-content generation by up to 93.0%.
Even a single compromised pipeline stage can inject backdoors that drastically misalign LLMs, bypassing standard safety alignment.