More efficient subword tokenization: ToaST cuts token counts by 11% and improves language model performance by up to 7.6% compared to standard methods.
Escape the greedy trap: Convex optimization yields tokenizers that compress better and come with optimality guarantees.