LongCat Team
Attention Sink, the tendency of Transformers to fixate on seemingly irrelevant tokens, is more than a quirk: it is a fundamental challenge that impairs training and inference and even contributes to hallucinations, demanding a systematic approach to understanding and mitigating its effects.
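To make "fixate on seemingly irrelevant tokens" concrete, here is a minimal diagnostic sketch (not from the paper) that measures how much attention mass each layer of a small causal LM places on the first token, a common signature of the sink phenomenon; the model choice, the `eager` attention setting, and averaging over heads and query positions are illustrative assumptions.

```python
# Hypothetical diagnostic, not the paper's method: probe attention mass on
# the first token, where sink behavior typically concentrates.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # illustrative choice; any causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
# "eager" attention is needed so the model can return attention weights.
model = AutoModelForCausalLM.from_pretrained(model_name, attn_implementation="eager")
model.eval()

ids = tok("Attention sink is more than just a quirk.", return_tensors="pt").input_ids

with torch.no_grad():
    out = model(ids, output_attentions=True)

# out.attentions: one (batch, heads, seq_len, seq_len) tensor per layer.
for layer, attn in enumerate(out.attentions):
    # Average attention paid to key position 0, over heads and queries,
    # skipping query 0 (it can only attend to itself under causal masking).
    sink_mass = attn[0, :, 1:, 0].mean().item()
    print(f"layer {layer:2d}: mean attention on first token = {sink_mass:.3f}")
```

If sink behavior is present, later layers typically report a disproportionate share of attention on token 0 even though it carries little semantic content.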