Code-executing agents can autonomously generate new, solvable math problems that are harder than existing ones, offering a scalable solution to the bottleneck of high-quality training data for advanced LLMs.
General-purpose LLM agents stumble badly when faced with the messy reality of diverse, multi-domain tasks, and simply scaling interactions or parallel sampling doesn't fix it.