Shuang Qiu

Papers on Lattice

Total citations

Topics

Research focus

Robotics & Embodied AI (1), World Models & Planning (1), RLHF & Preference Learning (1), Tool Use & Agents (1)

Frequent co-authors

Zhongjian Qiao (2), Jiafei Lyu (1), Boxiang Lyu (1), Yao Shu (1)

Papers (2)

Mar 9, 2026

Tsinghua AI · Mar 9, 2026 · also UChicago

Model-based Offline RL via Robust Value-Aware Model Learning with Implicitly Differentiable Adaptive Weighting

RAMBO's instability got you down? ROMI offers a robust, value-aware model learning approach with implicitly differentiable adaptive weighting that outperforms RAMBO and other SOTA methods in offline RL benchmarks.

Zhongjian Qiao, Jiafei Lyu, Boxiang Lyu +3

Robotics & Embodied AI, World Models & Planning

Feb 15, 2026

Tsinghua AI · Feb 15, 2026

Deep Dense Exploration for LLM Reinforcement Learning via Pivot-Driven Resampling

By strategically resampling from deep, recoverable states ("pivots") within unsuccessful trajectories, DDE drastically improves LLM reinforcement learning compared to methods that oversample from the root or blindly disperse budgets.

Yiran Guo, Zhongjian Qiao, Yingqi Xie +3

RLHF & Preference Learning, Tool Use & Agents
