Xinpeng Ding

Xidian University

Research focus

Computer Vision (3) · Multimodal Models (3) · Architecture Design (Transformers, SSMs, MoE) (2) · Inference & Quantization (1)

Frequent co-authors

Bingjun Luo (2) · Tony Wang (2) · Hanqi Chen (1) · Chaoqi Chen (1)

Papers (3)

May 21, 2026

Tsinghua AI · May 21, 2026 · also Shenzhen University, Xidian

Enhancing Visual Token Representations for Video Large Language Models via Training-Free Spatial-Temporal Pooling and Gridding

ST-GridPool enhances the visual token representations of Video LLMs, improving their performance without any additional training.

Bingjun Luo, Tony Wang, Hanqi Chen +1

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Multimodal Models

Tsinghua AI · May 21, 2026 · also Shenzhen University, Xidian

ST-SimDiff: Balancing Spatiotemporal Similarity and Difference for Efficient Video Understanding with MLLMs

Rather than only pruning redundant tokens, ST-SimDiff cuts the cost of MLLM video processing by also preserving the tokens that capture changes in the video.

Bingjun Luo, Tony Wang, Chaoqi Chen +1

Computer Vision · Inference & Quantization · Multimodal Models

Feb 15, 2026

Feb 15, 2026 · also Xidian

DenseMLLM: Standard Multimodal LLMs are Intrinsic Dense Predictors

Standard multimodal LLMs can perform surprisingly well on dense prediction tasks like segmentation and depth estimation, without needing any task-specific decoder modules.

Hongze Shen, Lexiang Tang, Xinpeng Ding +2