Shilong Mu

Research focus

Multimodal Models (1), Reasoning & Chain-of-Thought (1), RLHF & Preference Learning (1), Robotics & Embodied AI (1)

Frequent co-authors

Yibin Liu (1), Yaxing Lyu (1), Daqi Gao (1), Daqiang Gao (1)

Papers (1)

Mar 16, 2026 · also HUST, ZJU

From Passive Observer to Active Critic: Reinforcement Learning Elicits Process Reasoning for Robotic Manipulation

A 7B model trained with RL can outperform 72B-scale general MLLMs in robotic manipulation process supervision by explicitly reasoning about progress toward the final task goal.

Yibin Liu, Yaxing Lyu, Daqi Gao +6

Multimodal Models, Reasoning & Chain-of-Thought, RLHF & Preference Learning +1
