Double down on speculative decoding: running speculation and verification in parallel cuts inference latency by 2x.