University of Sheffield
Current LLM agents are surprisingly bad at synthesizing information from multiple sources to solve realistic problems, achieving dismal scores on the new DEEPSYNTH benchmark.