Quantization can halve memory traffic during speculative decoding's verification stage, boosting end-to-end throughput by 28% without retraining.