Yuxuan Sun

LongCat Team

Papers on Lattice

Total citations

Topics

Publication activity (papers/week, last 8 weeks)

Research focus

Architecture Design (Transformers, SSMs, MoE) (1) · Interpretability & Mechanistic Interp (1) · Natural Language Processing (1)

Frequent co-authors

Zunhai Su (1) · Hengyuan Zhang (1) · Yifan Zhang (1) · Yaxiu Liu (1)

Papers (1)

Apr 11, 2026

Tsinghua AI · 2w ago · also HKU, Huawei, LongCat Team, Ohio State +3

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Attention Sink, where Transformers fixate on seemingly irrelevant tokens, is more than a quirk: it is a fundamental challenge that affects training and inference and can even cause hallucinations, so understanding and mitigating it demands a systematic approach.

Zunhai Su, Hengyuan Zhang, Yifan Zhang +13

Architecture Design (Transformers, SSMs, MoE) · Interpretability & Mechanistic Interp · Natural Language Processing
