Search papers, labs, and topics across Lattice.
4
1
10
3
Training on D3-Gym, a new dataset of real-world scientific tasks with verifiable environments, closes the gap between open-source and proprietary models on ScienceAgentBench by 7.8 points.
Recurrent-depth transformers don't just memorize facts, they learn to *reason* with them, unlocking systematic generalization and depth extrapolation that eludes standard transformers.
Generative multi-agent systems spontaneously exhibit collusion and conformity, mirroring societal pathologies, even without explicit programming and bypassing individual agent safeguards.
Text-to-SQL models crumble under realistic database schema changes, especially at the table level, but training on diverse, perturbed schemas can dramatically improve robustness.