Tingting Gao

Surprisingly, the "think before answer" paradigm fails to enhance generative recommendation models, prompting a novel approach that redefines how reasoning is integrated into these systems.

OneRec Team, Boyang Ding, Chenglong Chu +61

Reasoning & Chain-of-Thought · Recommendation & Information Retrieval

May 27, 2026 · also Tsinghua AI, CAS, Kuaishou

VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning

Jinpeng Wang, Yankai Yang, Yancheng Long +8

Computer Vision · Multimodal Models · RLHF & Preference Learning

May 25, 2026 · also Tsinghua AI, BNRist, Department of Automation

MetaphorVU: Towards Metaphorical Video Understanding

MLLMs can't grasp metaphors in videos, revealing a surprising gap in their high-order cognitive abilities compared to humans.

Zhuoqun Li, Boxi Cao, Guiping Jiang +10

Computer Vision · Eval Frameworks & Benchmarks · Multimodal Models

Apr 9, 2026

Towards Real-world Human Behavior Simulation: Benchmarking Large Language Models on Long-horizon, Cross-scenario, Heterogeneous Behavior Traces

LLMs exhibit a "Utopian bias" when simulating human behavior, converging towards an unrealistic "positive average person" and failing to capture individual differences and long-tail behaviors.

Jiawei Chen, Ruoxi Xu, Boxi Cao +12

Eval Frameworks & Benchmarks · Tool Use & Agents · World Models & Planning

Feb 26, 2026 · also OpenAI, Tsinghua AI, CAS, China Academy of Space Technology +2

ContextRL: Enhancing MLLM's Knowledge Discovery Efficiency with Context-Augmented RL

Context-augmented RL lets smaller MLLMs punch *way* above their weight, rivaling much larger models on reasoning tasks while dodging reward hacking.

Jinpeng Wang, Jinpeng Wang, Yifan Zhang +16

Reasoning & Chain-of-Thought · RLHF & Preference Learning · Tool Use & Agents

Feb 12, 2026 · also OpenAI, Tsinghua AI, Cornell, HIT +2

Spatial Chain-of-Thought: Bridging Understanding and Generation Models for Spatial Reasoning Generation

Unleashing diffusion models' spatial reasoning potential is now possible without expensive joint training, thanks to a clever plug-and-play framework that leverages MLLMs for layout planning.

Wei Chen, Mingqiao Liu, Haojie Ding +5

Computer Vision · Multimodal Models · Reasoning & Chain-of-Thought

Papers (9)