University of Toronto
LLM agents struggle to juggle multiple tasks when tool use involves realistic delays, revealing critical weaknesses in temporal reasoning and coordination.
Even the best LLMs struggle to effectively discover, refine, and reuse skills over a lifetime of experience, suggesting current benchmarks significantly overestimate real-world agentic capabilities.