Search papers, labs, and topics across Lattice.
Google DeepMind
1
0
3
Forget slow prefix trees: STATIC unlocks massive speedups (up to 1033x) for constrained LLM decoding on GPUs/TPUs by vectorizing trie traversals into sparse matrix operations.