A 7B model trained with RL can outperform 72B-scale general MLLMs in robotic manipulation process supervision by explicitly reasoning about progress toward the final task goal.