Unseen token generalization in transformers isn't just about copying; it's fundamentally limited by a representational collapse in the unembedding space.
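One way to probe this claim is to measure how similar the unembedding directions for unseen tokens become. Below is a minimal sketch of such a probe, not the paper's method: `W_U` and `unseen_ids` are random stand-ins, and with a real model you would instead slice the rows of its output-projection matrix for tokens absent from training.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model = 1000, 64
W_U = rng.normal(size=(vocab_size, d_model))  # hypothetical unembedding matrix
unseen_ids = np.arange(900, 1000)             # hypothetical unseen-token ids

def mean_pairwise_cosine(vectors: np.ndarray) -> float:
    """Average cosine similarity over all distinct pairs of row vectors."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ unit.T
    n = len(vectors)
    return float(sims[~np.eye(n, dtype=bool)].mean())

# If unseen-token directions have collapsed to near-identical vectors, this
# value approaches 1.0; for the random stand-in here it stays near 0.
print(f"unseen-token similarity: {mean_pairwise_cosine(W_U[unseen_ids]):.3f}")
```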
Turn sparse binary rewards into dense supervision signals by having a model revise its own work, then distilling the revision strategy back into the original generation policy.
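A minimal sketch of that loop, assuming a hypothetical model interface with `generate`, `revise`, `verify` (the sparse binary reward), and `train_on` (the distillation step); none of these names come from the source, and the toy rules exist only to make the data flow executable:

```python
class ToyModel:
    """Invented stand-in for a language model; every method is a
    placeholder for the corresponding LM operation."""
    def generate(self, prompt: str) -> str:
        return prompt.lower()                    # first-pass draft

    def revise(self, prompt: str, draft: str) -> str:
        return draft + "!"                       # self-revision of the draft

    def verify(self, prompt: str, output: str) -> bool:
        return output.endswith("!")              # sparse binary reward

    def train_on(self, prompt: str, target: str) -> None:
        print(f"distill: {prompt!r} -> {target!r}")  # stand-in gradient step


def revise_then_distill(model, prompts):
    """Densify a sparse pass/fail reward: wherever revision flips a failing
    draft to a pass, train the generator to emit the revision directly."""
    for prompt in prompts:
        draft = model.generate(prompt)
        if model.verify(prompt, draft):
            continue                             # draft already passes
        revision = model.revise(prompt, draft)
        if model.verify(prompt, revision):
            model.train_on(prompt, revision)     # fold the fix into generation


revise_then_distill(ToyModel(), ["Prove 1+1=2", "Sort [3,1,2]"])
```

The key design point is that only revisions that flip a failing draft to a pass become training targets, so every distilled example carries a verified improvement rather than the raw binary signal.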