Search papers, labs, and topics across Lattice.
3
0
2
4
Surprisal theory's reliance on arbitrary tokenization schemes undermines its validity, but this framework offers a way to fix it.
Repurpose existing language models for entirely new output formats like bytes, words, or even DNA sequences, all without retraining, by wrapping them in a finite-state transducer.
Forget simple probability averaging: a byte-level sequential Monte Carlo algorithm unlocks better language model ensembles by enabling sophisticated aggregation strategies and handling mismatched vocabularies.