Retrieval augmentation helps head-avatar models handle novel expressions by mixing in similar expressions retrieved from a large unlabeled dataset during training, improving generalization without extra labels or architectural changes.
Test-time training with KV binding isn't memorization; it is effectively a learned linear attention mechanism, which unlocks architectural simplifications and parallelization.
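A minimal sketch of the equivalence claimed above, with illustrative names and dimensions (not taken from the paper): binding key-value pairs into a linear map via outer-product updates, then reading it out with a query, reproduces unnormalized linear attention exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
keys = rng.normal(size=(3, d))
vals = rng.normal(size=(3, d))

# "Memorize" each (k, v) pair with a toy test-time-training rule:
# accumulate outer products into a linear map W.
W = np.zeros((d, d))
for k, v in zip(keys, vals):
    W += np.outer(v, k)  # one KV-binding update

q = rng.normal(size=d)
readout = W @ q  # query the bound memory

# The same readout, written as linear attention over the sequence:
# sum_i (k_i . q) * v_i
linear_attn = sum((k @ q) * v for k, v in zip(keys, vals))

assert np.allclose(readout, linear_attn)
```

Because the readout is a single matrix-vector product, the per-pair "memorization" steps can be accumulated in any order or in parallel, which is the source of the claimed simplifications.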