By enabling draft models to "contemplate the future," ConFu achieves significant speedups in speculative decoding, outperforming EAGLE-3 by 8-11% on Llama-3 models.