Forget about re-balancing losses: gradient geometry is the key to unlearning in LLMs without sacrificing retention.
PPO can be made sample-efficient and stable for long-horizon reasoning in LLMs by treating the problem as a sequence-level contextual bandit, sidestepping the need for computationally expensive multi-sampling.
Finally, a diffusion model lets you puppeteer multiple objects in a video with nothing but text prompts, opening the door to complex scene editing.