LLMs can be detoxified with minimal performance impact by surgically intervening on a small subset of attention heads that a novel causal inference approach identifies as causally linked to toxicity.