Forget more data: pre-training on just 164M tokens of synthetic data from Neural Cellular Automata can outperform pre-training on 1.6B tokens of natural language for downstream LLM tasks.
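
To make the idea concrete, here is a minimal, hypothetical sketch of how one might roll out a Neural Cellular Automaton and serialize its states into a synthetic token stream for pre-training. The teaser does not specify the actual generator architecture, serialization scheme, or hyperparameters, so every name and size below (GRID, CHANNELS, VOCAB, the step count, the quantization rule) is an illustrative assumption, not the paper's method.

```python
# Hypothetical sketch: generate synthetic pre-training tokens from an NCA rollout.
# All sizes and the update rule are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)

GRID, CHANNELS, VOCAB = 64, 8, 256  # assumed grid width, state dim, vocab size

# Fixed random "neural" update rule: perceive neighbors, then a tiny 2-layer MLP.
W1 = rng.standard_normal((3 * CHANNELS, 32)) * 0.3
W2 = rng.standard_normal((32, CHANNELS)) * 0.3

def nca_step(state):
    """One NCA update: each cell perceives (left, self, right) and applies the MLP."""
    left = np.roll(state, 1, axis=0)
    right = np.roll(state, -1, axis=0)
    perception = np.concatenate([left, state, right], axis=1)  # (GRID, 3*CHANNELS)
    hidden = np.tanh(perception @ W1)
    return state + 0.1 * np.tanh(hidden @ W2)  # residual update keeps dynamics bounded

def tokens_from_rollout(steps=32):
    """Run the NCA and quantize channel 0 of each cell into discrete token ids."""
    state = rng.standard_normal((GRID, CHANNELS)) * 0.1
    tokens = []
    for _ in range(steps):
        state = nca_step(state)
        # Bucket channel 0 into VOCAB tokens per cell (assumed serialization).
        ids = np.clip(((state[:, 0] + 3) / 6 * VOCAB).astype(int), 0, VOCAB - 1)
        tokens.extend(ids.tolist())
    return tokens

corpus = tokens_from_rollout()
print(len(corpus), corpus[:10])  # 2048 tokens per rollout under these assumptions
```

The point of such a generator is that the token stream is cheap to produce in unlimited quantity yet carries structured, spatially correlated dynamics rather than noise, which is presumably what lets a small synthetic corpus (164M tokens) compete with a much larger natural-language one (1.6B tokens) during pre-training.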