Jinpeng Wang

Harbin Institute of Technology

Research focus

Multimodal Models (5) · Tool Use & Agents (4) · RLHF & Preference Learning (4) · Natural Language Processing (3)

Frequent co-authors

Shu-Tao Xia (3) · Yaowei Wang (2) · Haonan Fan (2) · Kaiyu Jiang (2)

Papers (9)

Jul 5, 2026

Tsinghua AI · also Graduate School, HIT, Peng Cheng Laboratory, ZJU

UI-MOPD: Multi-Platform On-Policy Distillation for Continual GUI Agent Learning

UI-MOPD achieves a remarkable balance between retaining existing capabilities and adapting to new platforms, with task success rates that challenge conventional approaches in GUI agent learning.

Niu Lian, Alan Chen, Zhehao Yu +8

Multimodal Models · Tool Use & Agents

Jun 10, 2026

Jun 10, 2026 · also HIT, HKU

Beyond Fully Random Masking: Attention-Guided Denoising and Optimization for Diffusion Language Models

Attention-guided denoising can dramatically enhance reasoning performance in diffusion language models, outperforming traditional post-training methods.

Jia Deng, Junyi Li, Jinpeng Wang +2

Architecture Design (Transformers, SSMs, MoE) · Natural Language Processing · Reasoning & Chain-of-Thought

May 27, 2026

May 27, 2026 · also Tsinghua AI, CAS, Kuaishou

VCap: Hypergeometric Rewards for Weak-to-Strong Visual Captioning

Jinpeng Wang, Yankai Yang, Yancheng Long +8

Computer Vision · Multimodal Models · RLHF & Preference Learning

May 21, 2026

May 21, 2026 · also HIT, Jilin, Meituan, Peng Cheng Laboratory

SegCompass: Exploring Interpretable Alignment with Sparse Autoencoders for Enhanced Reasoning Segmentation

Unlock "white-box" reasoning in vision-language models: SegCompass's sparse autoencoder creates an interpretable bridge between visual perception and chain-of-thought, outperforming black-box alignment methods.

Zhenyu Lu, Liupeng Li, Jinpeng Wang +3

Interpretability & Mechanistic Interp · Multimodal Models · Reasoning & Chain-of-Thought

Mar 19, 2026

Mar 19, 2026 · also Tsinghua AI, Graduate School, HIT, Peng Cheng Laboratory

PromptHub: Enhancing Multi-Prompt Visual In-Context Learning with Locality-Aware Fusion, Concentration and Alignment

Spatial awareness is the secret ingredient to unlocking better visual in-context learning, boosting performance across diverse vision tasks.

Tianci Luo, Jinpeng Wang, Shi-Yu Qin +6

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Multimodal Models

Mar 2, 2026

Mar 2, 2026 · also Tsinghua AI, Graduate School, HIT, Peng Cheng Laboratory

From Verbatim to Gist: Distilling Pyramidal Multimodal Memory via Semantic Information Bottleneck for Long-Horizon Video Agents

Achieve state-of-the-art long-horizon video understanding by compressing multimodal memories into high-level semantic schemas, enabling efficient reasoning without losing crucial details.

Niu Lian, Hanshu Yao, Hanshu Yao +5

Computer Vision · Multimodal Models · Tool Use & Agents

Feb 26, 2026

Feb 26, 2026 · also HIT, Meituan, Northeastern

Reinforcing Real-world Service Agents: Balancing Utility and Cost in Task-oriented Dialogue

Task-oriented dialogue agents can now learn to balance user satisfaction and operational costs, thanks to a new RL framework that optimizes for both.

Ninghang Gao, Yuqing Dai, Yuqin Dai +7

Natural Language Processing · RLHF & Preference Learning · Tool Use & Agents

Feb 26, 2026 · also OpenAI, Tsinghua AI, CAS, China Academy of Space Technology +2

ContextRL: Enhancing MLLM's Knowledge Discovery Efficiency with Context-Augmented RL

Context-augmented RL lets smaller MLLMs punch *way* above their weight, rivaling much larger models on reasoning tasks while dodging reward hacking.

Jinpeng Wang, Jinpeng Wang, Yifan Zhang +16

Reasoning & Chain-of-Thought · RLHF & Preference Learning · Tool Use & Agents

Feb 19, 2026

Feb 19, 2026 · also HIT, University of Louisiana at Lafayette

Improving LLM-based Recommendation with Self-Hard Negatives from Intermediate Layers

LLMs learn to recommend better by looking inside themselves, using intermediate layer activations to generate harder negatives on the fly.

Bingqian Li, Xiaolei Wang, Jinpeng Wang +1

Natural Language Processing · Recommendation & Information Retrieval · RLHF & Preference Learning
