Sparse updates in on-policy distillation can match full training performance, challenging conventional beliefs about parameter rewriting in deep learning.