Radboud University &
Control knobs for LLM safety exist: MASCing lets you steer MoE behavior *without* costly retraining, boosting jailbreak defense by up to 89.2% and adult content generation control by up to 93.0%.
Backdoor triggers in ViTs leave a surprisingly clear signature: a linear direction in activation space that can be directly manipulated to activate or deactivate the backdoor.
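To make the idea of a "linear direction in activation space" concrete, here is a minimal sketch on synthetic data. It is not the paper's method: the difference-of-means estimator, the function names, and the toy activations are all assumptions chosen for illustration. It shows how a single direction could be estimated from clean versus triggered activations, then projected out (to deactivate) or added (to activate).

```python
import numpy as np

# Hypothetical helpers for illustration only; not taken from the paper.

def mean_difference_direction(clean, triggered):
    """Unit vector pointing from the clean mean to the triggered mean."""
    d = triggered.mean(axis=0) - clean.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate(acts, direction):
    """Remove each activation's component along the direction."""
    return acts - np.outer(acts @ direction, direction)

def steer(acts, direction, alpha):
    """Shift each activation by alpha along the direction."""
    return acts + alpha * direction

# Synthetic stand-in for model activations (assumption: the trigger
# shifts activations along one fixed axis).
rng = np.random.default_rng(0)
dim = 16
true_dir = np.zeros(dim)
true_dir[0] = 1.0
clean = rng.normal(size=(200, dim))
triggered = rng.normal(size=(200, dim)) + 5.0 * true_dir

direction = mean_difference_direction(clean, triggered)
ablated = ablate(triggered, direction)        # "deactivate"
steered = steer(clean, direction, alpha=5.0)  # "activate"
```

In a real ViT the activations would come from a chosen layer (for example via forward hooks), and whether a difference-of-means direction suffices there is an empirical question this sketch does not answer.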