Peking University
Shrinking diffusion LLMs by distilling across different architectures can yield surprisingly strong performance, even boosting code generation scores by 16 points on HumanEval.