Search papers, labs, and topics across Lattice.
3
0
6
5
Translated datasets can significantly elevate the quality of German language models, with KletterMix showing measurable performance gains over existing resources.
LLMs can get a free performance boost: decoupling compute and capacity within each layer lets you beat standard transformers at the same FLOPs.
Looping helps transformers think harder on math problems, while memory lets them remember more commonsense facts, and combining both beats simply scaling up layers.