xLSTM models can now nearly match the performance of their transformer-based teachers via a novel distillation pipeline, potentially unlocking significant efficiency gains.