Peking University
Today's LLM agents fall far short of "always-on" personal assistants, failing more than 65% of the time when reasoning over realistic, noisy digital environments spanning months of user activity.
Training agents in MobileGym transfers surprisingly well to real-world mobile devices, retaining over 95% of the simulation-side performance gains.