Forget slow prefix trees: STATIC unlocks massive speedups (up to 1033x) for constrained LLM decoding on GPUs/TPUs by vectorizing trie traversals into sparse matrix operations.
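
To make "trie traversals as sparse matrix operations" concrete, here is a minimal sketch, not the STATIC implementation. It assumes the trie is stored as a CSR-style sparse transition table (rows are trie states, columns are token IDs, values are next states). The names `build_csr_trie`, `allowed_mask`, and `step` are hypothetical, and the plain Python loops stand in for the batched gather operations an accelerator would run.

```python
# Hypothetical sketch: a trie over token IDs stored as a CSR-style sparse
# transition table (state x token -> next state). Illustrative only.

def build_csr_trie(sequences):
    """Return (row_ptr, col_idx, next_state) for a trie of token-ID sequences."""
    children = [{}]  # children[state] maps token -> next state; state 0 is the root
    for seq in sequences:
        state = 0
        for tok in seq:
            if tok not in children[state]:
                children.append({})
                children[state][tok] = len(children) - 1
            state = children[state][tok]
    # Flatten the per-state edge dicts into three flat arrays (CSR layout).
    row_ptr, col_idx, next_state = [0], [], []
    for edges in children:
        for tok in sorted(edges):
            col_idx.append(tok)
            next_state.append(edges[tok])
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, next_state


def allowed_mask(csr, states, vocab_size):
    """For a batch of trie states, return one 0/1 vocabulary mask per state."""
    row_ptr, col_idx, _ = csr
    masks = []
    for s in states:
        mask = [0] * vocab_size
        # Row s of the sparse table lists exactly the tokens allowed next.
        for k in range(row_ptr[s], row_ptr[s + 1]):
            mask[col_idx[k]] = 1
        masks.append(mask)
    return masks


def step(csr, states, tokens):
    """Advance each state by its chosen token; -1 marks an invalid transition."""
    row_ptr, col_idx, next_state = csr
    out = []
    for s, t in zip(states, tokens):
        nxt = -1
        for k in range(row_ptr[s], row_ptr[s + 1]):
            if col_idx[k] == t:
                nxt = next_state[k]
                break
        out.append(nxt)
    return out


# Usage: constrain decoding to three allowed token sequences over a 6-token vocabulary.
csr = build_csr_trie([[1, 2, 3], [1, 2, 4], [5]])
masks = allowed_mask(csr, [0, 2], vocab_size=6)
advanced = step(csr, [0, 2], [1, 4])
```

Because every state's outgoing edges sit in one contiguous slice of flat arrays, a whole batch of beams can be masked and advanced with array lookups instead of per-beam pointer chasing, which is the general idea the summary describes.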