LLMs can generate recommendations up to 3.1x faster by explicitly modeling token position within items and speculation depth during speculative decoding.