Forget catastrophic forgetting: sparse memory finetuning, enhanced with a KL-divergence-based update rule, lets LLMs learn continuously without trashing old knowledge.
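To make the mechanism concrete, here is a minimal PyTorch sketch of the general recipe the summary describes: update only a small subset of memory slots while a KL term anchors the model to its pre-update predictions. Everything here (the toy `MemoryModel`, the gradient-norm slot ranking, `top_k`, `kl_weight`) is an illustrative assumption, not the paper's actual architecture or API.

```python
import torch
import torch.nn.functional as F

# Toy stand-in for a model with a dedicated memory table. All names are
# illustrative, not the paper's.
class MemoryModel(torch.nn.Module):
    def __init__(self, vocab=100, dim=32, slots=512, classes=10):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab, dim)                # frozen backbone
        self.memory = torch.nn.Parameter(torch.randn(slots, dim))  # sparse-update target
        self.head = torch.nn.Linear(dim, classes)

    def forward(self, x):
        h = self.embed(x).mean(dim=1)
        attn = torch.softmax(h @ self.memory.t(), dim=-1)  # soft lookup over slots
        return self.head(h + attn @ self.memory)

def sparse_kl_step(model, ref_model, x, y, opt, top_k=16, kl_weight=0.1):
    """One update: task loss on new data plus a KL penalty toward a frozen
    reference model, with gradients kept only on the top-k memory slots."""
    logits = model(x)
    with torch.no_grad():
        ref_logits = ref_model(x)
    ce = F.cross_entropy(logits, y)
    # KL(reference || current) anchors new predictions to pre-update behaviour.
    kl = F.kl_div(F.log_softmax(logits, dim=-1),
                  F.softmax(ref_logits, dim=-1), reduction="batchmean")
    loss = ce + kl_weight * kl
    opt.zero_grad()
    loss.backward()
    # Sparsity: zero the gradient on all but the k slots with the largest
    # gradient norm (one plausible selection rule; the paper's may differ).
    grad = model.memory.grad
    keep = grad.norm(dim=1).topk(top_k).indices
    mask = torch.zeros(grad.size(0), dtype=torch.bool)
    mask[keep] = True
    grad[~mask] = 0.0
    opt.step()
    return loss.item()

model = MemoryModel()
ref = MemoryModel()
ref.load_state_dict(model.state_dict())
for p in ref.parameters():
    p.requires_grad_(False)
opt = torch.optim.SGD([model.memory], lr=0.1)  # only the memory table is trainable
x = torch.randint(0, 100, (8, 5))
y = torch.randint(0, 10, (8,))
print(sparse_kl_step(model, ref, x, y, opt))
```

The two levers work together: sparsity confines where the update lands, and the KL penalty bounds how far the model's outputs can drift from what it knew before.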
Masked diffusion models can finally achieve faster inference without sacrificing generation quality, thanks to a clever speculative decoding scheme.
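In the same spirit, here is a minimal sketch of draft-then-verify decoding for a masked diffusion model, assuming a simple token-level acceptance test. The function names, acceptance threshold, and toy model below are hypothetical stand-ins, not the paper's algorithm.

```python
import torch

MASK = 0  # hypothetical mask-token id

@torch.no_grad()
def speculative_mdm_decode(draft_logits_fn, target_logits_fn, seq,
                           accept_tau=0.9, max_rounds=32):
    """Each round: the draft model fills every masked position in one parallel
    pass; the target model then scores the filled-in sequence and commits only
    the proposals it assigns high probability, re-masking the rest."""
    seq = seq.clone()
    for _ in range(max_rounds):
        masked = (seq == MASK).nonzero(as_tuple=True)[0]
        if masked.numel() == 0:
            break
        # 1) Draft: propose a token for every masked slot at once.
        draft = draft_logits_fn(seq)
        draft[:, MASK] = float("-inf")          # never propose the mask token
        candidate = seq.clone()
        candidate[masked] = draft.argmax(dim=-1)[masked]
        # 2) Verify: one target-model pass over the fully filled candidate.
        probs = target_logits_fn(candidate).softmax(dim=-1)
        p_accept = probs[masked, candidate[masked]]
        accept = p_accept >= accept_tau
        accept[p_accept.argmax()] = True        # always commit >= 1 token/round
        seq[masked[accept]] = candidate[masked[accept]]
    return seq

# Toy usage with a shared dummy "model": logits at each position depend only
# on the current token at that position.
torch.manual_seed(0)
V, L = 50, 16
table = torch.randn(V, V)
toy_logits = lambda s: table[s]
out = speculative_mdm_decode(toy_logits, toy_logits,
                             torch.full((L,), MASK, dtype=torch.long))
print(out)
```

Because accepted tokens are committed in parallel and only rejected positions are re-masked, the number of full target-model passes can be far smaller than with one-token-at-a-time decoding.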