Weihao Ye

Xiamen University

Research focus

Architecture Design (Transformers, SSMs, MoE) (2) · Interpretability & Mechanistic Interp (1) · Natural Language Processing (1) · Computer Vision (1) · Inference & Quantization (1)

Frequent co-authors

Zunhai Su (2) · Hengyuan Zhang (1) · Yifan Zhang (1) · Yaxiu Liu (1)

Papers (2)

Apr 11, 2026

Tsinghua AI · also Bosch AI, HKU, LongCat Team, Ohio State +3

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Attention sink, where Transformers fixate on seemingly irrelevant tokens, is more than a quirk: it is a fundamental challenge that affects training and inference and can even cause hallucinations, which calls for a systematic approach to understanding and mitigating its effects.

Zunhai Su, Hengyuan Zhang, Yifan Zhang +9

Architecture Design (Transformers, SSMs, MoE) · Interpretability & Mechanistic Interp · Natural Language Processing

Feb 25, 2026

Tsinghua AI · also CAS, HKU, LongCat Team, Northwestern +1

XStreamVGGT: Extremely Memory-Efficient Streaming Vision Geometry Grounded Transformer with KV Cache Compression

By pruning and quantizing the KV cache, XStreamVGGT achieves a 4.42x memory reduction and a 5.48x speedup in streaming 3D reconstruction without sacrificing performance.

Zunhai Su, Weihao Ye, Hansen Feng +1

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Inference & Quantization
