NTU Singapore
Fine-grained management of speculative decoding phases can boost LLM serving throughput by over 50% and cut latency nearly in half.