UC San Diego
A clever two-stage agent using smaller models can produce better, more substantive peer reviews than brute-force application of the largest LLMs.
Today's best AI agents still fail more than half the time on real-world tasks combining vision, search, and coding, revealing critical gaps in reasoning and tool use.