EPFL
Current LLM agents are surprisingly bad at synthesizing information from multiple sources to solve realistic problems, achieving dismal scores on the new DEEPSYNTH benchmark.