Laboratory
A stark capability cliff reveals that even leading AI models falter on complex workflows, achieving less than 15% success despite advancements in tool-use benchmarks.
Automating LLM fine-tuning is now possible: a multi-agent system, TREX, matches or exceeds human performance on a diverse set of real-world tasks.
LLMs can now automatically evolve and optimize GPU kernels, outperforming both hand-tuned kernels and proprietary models like Gemini and Claude.
Optimism is the key to stable and convergent safe RLHF, according to a new primal-dual framework that unifies existing alignment algorithms.