Search papers, labs, and topics across Lattice.
Johns Hopkins University
4
0
3
16
Surprisal theory's reliance on arbitrary tokenization schemes undermines its validity, but this framework offers a way to fix it.
Forget specialized prefix-parsing algorithms: a simple grammar transformation lets you use standard parsers for efficient prefix parsing and next-token prediction.
Repurpose existing language models for entirely new output formats like bytes, words, or even DNA sequences, all without retraining, by wrapping them in a finite-state transducer.
Forget simple probability averaging: a byte-level sequential Monte Carlo algorithm unlocks better language model ensembles by enabling sophisticated aggregation strategies and handling mismatched vocabularies.