Search papers, labs, and topics across Lattice.
1
0
3
4
Shrinking draft model vocabularies by up to 97% can significantly boost speculative decoding throughput, especially for domain-specific tasks.