Search papers, labs, and topics across Lattice.
1
0
2
10
LLMs possess a "word recovery" mechanism that allows them to reconstruct canonical word-level tokens from character-level inputs, explaining their surprising robustness to non-canonical tokenization.