Attention sinks, usually treated as a problem in Vision Transformers (ViTs), can instead be leveraged for efficient token pruning, giving faster inference without sacrificing accuracy.