Even leading AI models hit a stark capability cliff on complex workflows, achieving less than 15% success despite their advances on tool-use benchmarks.