Peking University
Despite advances in multimodal agents, even the best models struggle with interactive spatial reasoning, achieving only a 17.4% success rate on complex real-world tasks.
LLMs can leapfrog state-of-the-art scientific algorithms and human-designed solutions, but only if you scale the evaluation loop, not just the model.