Multilingual LLMs contain specialized attention heads that matter more for reasoning than standard retrieval heads, and selectively masking them sharply degrades performance.