Softmax attention heads specialize in stages during training, and a novel Bayes-softmax attention mechanism can achieve optimal prediction performance by suppressing noise from irrelevant heads.
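For context, the summary above refers to standard multi-head softmax attention, whose head-wise structure is what the claimed specialization and noise-suppression act on. The following is a minimal NumPy sketch of that baseline only; the Bayes-softmax variant itself is not specified here, and the function and parameter names (`multi_head_attention`, `Wq`, `Wk`, `Wv`) are illustrative assumptions, not the paper's code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model); Wq, Wk, Wv: (n_heads, d_model, d_head).
    # Each head attends independently; outputs are concatenated.
    outputs = []
    for q_w, k_w, v_w in zip(Wq, Wk, Wv):
        Q, K, V = X @ q_w, X @ k_w, X @ v_w
        scores = Q @ K.T / np.sqrt(K.shape[-1])  # scaled dot-product scores
        A = softmax(scores, axis=-1)             # each row sums to 1
        outputs.append(A @ V)
    return np.concatenate(outputs, axis=-1)
```

Because every head contributes equally to the concatenated output, a head that has not specialized to the task injects noise; the summary's claim is that a Bayes-style reweighting of heads can suppress that contribution.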