Shanghai AI Laboratory
Decomposing GUI agent trajectories into verifiable milestones and auditing the evidence chain yields a 10% boost in RL training performance, outperforming single-judge reward systems.
Strategic recovery from failures is key to deploying robots for complex assembly tasks in the real world.
Humanoid robots can now reliably transport objects on a tray in the real world, thanks to a hierarchical RL approach that isolates and cancels gait-induced disturbances.