Deploying transformers in real time just got a whole lot faster: this work reports speedups of up to 64x on GPUs while maintaining accuracy, using a novel hybrid-precision approach.