LLMs still struggle to effectively use tools in realistic API environments, achieving only 7-47% task completion rates on a new benchmark of 2500+ live APIs.