Cheng Li

Current vision-language models can *see* point cloud defects, but can't reliably *diagnose* them, highlighting a critical gap in grounded quality understanding.

Duanchu Wang, Cheng Li, Junjie Yang +4

Computer Vision Eval Frameworks & Benchmarks

May 26, 2026

2w ago·also USTC, ZJU

ForestHG-Trace: Traceable Long-Horizon Ecological Reasoning over Large-Scale Forest Scenes

LLMs can now perform traceable, multi-step ecological reasoning over complex forest environments by operating on ecological hypergraphs and invoking deterministic tools, achieving higher accuracy and faithfulness than single-step approaches.

Zihang Cheng, Duanchu Wang, Cheng Li +2

Computer Vision Multimodal Models Reasoning & Chain-of-Thought

May 25, 2026

2w ago·also CAS, USTC, Xidian, ZJU

VertiCue-Bench: Diagnosing Whether MLLMs Use Height Cues to Resolve 2D Ambiguity in Remote Sensing Natural Scenes

Despite showing promise in reading raw height data, today's MLLMs often fail to translate geometric perception into reliable semantic reasoning about natural scenes, even performing worse than RGB-only models when both modalities are needed.

Duanchu Wang, Junjie Yang, Zihang Cheng +4

Computer Vision Eval Frameworks & Benchmarks Multimodal Models

Apr 20, 2026

University of Science and Technology·Apr 20, 2026·also CUHK, Hefei Comprehensive National Science, Independent Researcher, UMacau +1

AdaCluster: Adaptive Query-Key Clustering for Sparse Attention in Video Generation

Get up to 4x faster video generation from diffusion transformers without sacrificing quality, thanks to a new clustering method that slashes attention costs.

Haoyue Tan, Shengnan Wang, Yulin Qiao +5

Architecture Design (Transformers, SSMs, MoE) Computer Vision
