On-policy data generation closes the training distribution gap and yields a +2.54 performance gain at 128K context length, suggesting that LLMs learn best from data that evolves with their capabilities.