Search papers, labs, and topics across Lattice.
1
0
3
2
Ditch 25% of your Transformer's attention parameters without sacrificing performance by swapping the dense output projection for a structured Hadamard transform, and watch your throughput climb.