Nanjing University
Forget blindly chasing teacher-student disagreement in on-policy distillation. Focusing on *learnable* disagreement, where the teacher nudges the student within its existing possibilities, unlocks surprisingly efficient learning.