Carnegie Mellon University
LLMs waste compute on prompt tokens that have already "figured it out": DASH selectively skips these tokens during prefill, speeding things up without retraining or sacrificing accuracy.
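The core idea of skipping "settled" tokens can be sketched as follows. This is a minimal illustration, not DASH's actual criterion: here a token is deemed settled when its hidden state barely changes between consecutive layers (the threshold and the relative-norm score are assumptions for the sketch), and settled tokens would be excluded from computation in later layers.

```python
import numpy as np

def select_active_tokens(h_prev, h_curr, threshold=0.05):
    """Return a boolean mask of tokens still worth computing.

    Tokens whose hidden state barely changed between two consecutive
    layers are treated as settled and can be skipped downstream.
    (Illustrative scoring rule, not the paper's exact method.)
    """
    delta = np.linalg.norm(h_curr - h_prev, axis=-1)
    rel_change = delta / (np.linalg.norm(h_prev, axis=-1) + 1e-8)
    return rel_change > threshold  # True = keep computing this token

# Toy prefill state: 8 prompt tokens, hidden dim 16.
rng = np.random.default_rng(0)
h_prev = rng.normal(size=(8, 16))
h_curr = h_prev.copy()
# Only the first 3 tokens are still changing; the rest have converged.
h_curr[:3] += rng.normal(scale=0.5, size=(3, 16))

mask = select_active_tokens(h_prev, h_curr)
print(mask)  # first 3 True (active), last 5 False (skipped)
```

In a real prefill loop, the attention and MLP blocks of subsequent layers would then be applied only to the masked-in rows, which is where the compute savings come from.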