Injecting "historical attention" into vision transformers boosts accuracy by over 1% with minimal architectural changes, suggesting that current ViTs underutilize information learned in earlier layers.
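The summary does not spell out the mechanism, so the following is only a minimal sketch of one plausible reading: "historical attention" taken to mean blending each layer's attention map with a running average of the maps produced by earlier layers, gated by a learnable scalar. The class name `HistoricalAttention`, the `gate` parameter, and the mixing rule are illustrative assumptions, not the paper's actual method.

```python
import torch
import torch.nn as nn


class HistoricalAttention(nn.Module):
    """Self-attention that mixes in attention maps from earlier layers.

    Hypothetical sketch: the softmaxed attention map is blended with a
    running average of preceding layers' maps via a learnable gate.
    The paper's exact formulation may differ.
    """

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # Learnable mixing weight; sigmoid keeps it in (0, 1).
        self.gate = nn.Parameter(torch.zeros(1))

    def forward(self, x, history=None):
        # x: (batch, tokens, dim); history: prior attention map or None.
        B, N, _ = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each (B, heads, N, head_dim)

        attn = (q @ k.transpose(-2, -1)) * self.scale
        attn = attn.softmax(dim=-1)  # (B, heads, N, N)

        if history is not None:
            # Inject historical attention. Both terms are row-stochastic,
            # so the convex mix is still a valid attention distribution.
            g = torch.sigmoid(self.gate)
            attn = (1 - g) * attn + g * history

        out = (attn @ v).transpose(1, 2).reshape(B, N, -1)
        return self.proj(out), attn


# Usage sketch: accumulate attention history across a stack of blocks.
blocks = nn.ModuleList([HistoricalAttention(dim=192) for _ in range(4)])
x = torch.randn(2, 197, 192)  # e.g. ViT-Tiny tokens: CLS + 14x14 patches
history = None
for blk in blocks:
    out, attn = blk(x, history)
    x = x + out  # residual connection
    history = attn if history is None else 0.5 * (history + attn)
```

This kind of change is "minimal" in the summary's sense: it adds one scalar parameter per block and reuses attention maps the model already computes, leaving the rest of the ViT untouched.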