Zhejiang University, Microsoft Research Asia
Agents struggle to orchestrate GUI, CLI, and code operations: even the top models achieve only a 41.2% success rate on real-world tasks.