X. Yan, C. Chen, and X. Li are with the Department of Automation, Tsinghua University.
LLMs can now scale depth more effectively: a new attention mechanism recovers diluted features in deeper layers, boosting performance with negligible overhead.