Microsoft Research (work done during an internship at Microsoft Research)
GUI agents can achieve significantly stronger task-solving capabilities through carefully designed post-training and data curation, without relying on costly online data collection.
Forget full-cache rollouts: this parameter-efficient fine-tuning method lets large reasoning models maintain accuracy while slashing memory usage during RL training.