Unseen token generalization in transformers isn't just about copying; it's fundamentally limited by a representational collapse in the unembedding space.