Jing Xiong

Shanghai AI Laboratory

Papers on Lattice

Total citations

Topics

Publication activitypapers/week, last 8 weeks

Research focus

Inference & Quantization (3)Multimodal Models (2)Natural Language Processing (2)Eval Frameworks & Benchmarks (2)

Frequent co-authors

Ngai Wong (3)Zhongwei Wan (3)Qi Han (2)Yunta Hsieh (2)

Papers (7)

Jun 18, 2026

University of Science and Technology3d ago·also Tsinghua AI, AI Laboratory, DUT, HKU +5

EventVLA: Event-Driven Visual Evidence Memory for Long-Horizon Vision-Language-Action Policies

EventVLA's foresight-driven memory mechanism boosts robotic manipulation success rates by 40% by dynamically capturing critical visual events before they become unobservable.

Ganlin Yang, Zhangzheng Tu, Yuqiang Yang +10

Multimodal Models Robotics & Embodied AI

Jun 9, 2026

1w ago·also HKU, LMSYS, UMich

Prefilling-dLLM: Predictive Prefilling for Long-Context Inference in Diffusion Language Models

Sparse prefilling can dramatically accelerate long-context inference in diffusion language models, achieving up to 28x speedup without sacrificing quality.

Jing Xiong, Qi Han, Shansan Gong +5

Inference & Quantization Natural Language Processing Scaling Laws & Emergent Abilities

Apr 20, 2026

Artificial Intelligence LaboratoryApr 20, 2026·also Fudan, HKU, HKUST, NJU +4

MedProbeBench: Systematic Benchmarking at Deep Evidence Integration for Expert-level Medical Guideline

LLMs are still far from being able to generate expert-level clinical guidelines, despite advances in deep research systems.

Jiyao Liu, Jianghan Shen, Sida Song +18

Eval Frameworks & Benchmarks Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Apr 11, 2026

Tsinghua AIApr 11, 2026·also Bosch AI, HKU, LongCat Team, Ohio State +4

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Attention Sink, where Transformers fixate on seemingly irrelevant tokens, is more than just a quirk – it's a fundamental challenge impacting training, inference, and even causing hallucinations, demanding a systematic approach to understanding and mitigating its effects.

Zunhai Su, Hengyuan Zhang, Yifan Zhang +12

Architecture Design (Transformers, SSMs, MoE)Interpretability & Mechanistic Interp Natural Language Processing

Mar 18, 2026

Mar 18, 2026·also Tsinghua AI, HKU, LongCat Team, PKU +2

Beyond Outliers: A Data-Free Layer-wise Mixed-Precision Quantization Approach Driven by Numerical and Structural Dual-Sensitivity

Achieve better compression in low-bit quantization by considering not just numerical sensitivity, but also the structural role of each layer.

Hengyuan Zhang, Xinrong Chen, Zunhai Su +7

Inference & Quantization Training Efficiency & Optimization

Mar 16, 2026

Mar 16, 2026·also B. Topic Samples Data source(s), HKU, Ohio State, PKU +1

MMSpec: Benchmarking Speculative Decoding for Vision-Language Models

Text-based speculative decoding falls flat for vision-language models, but ViSkip dynamically adapts to vision tokens for state-of-the-art acceleration.

Yunta Hsieh, Qi Han, Zhongwei Wan +4

Eval Frameworks & Benchmarks Inference & Quantization Multimodal Models

Feb 23, 2026

Feb 23, 2026·also HKU, HUST, Imperial, Shanghai AI Lab +4

DSDR: Dual-Scale Diversity Regularization for Exploration in LLM Reasoning

LLMs can reason better if you force them to explore *different* ways of being right, not just be more random.

Zhongwei Wan, Zhongwei Wan, Zhihao Dou +8

Reasoning & Chain-of-Thought RLHF & Preference Learning

Search

Jing Xiong

Publication activitypapers/week, last 8 weeks

Research focus

Frequent co-authors

Papers (7)