Jing Li

Papers on Lattice

Research focus

Multimodal Models (3), Computer Vision (2), Natural Language Processing (2), Code Generation & Program Synthesis (1)

Frequent co-authors

Longhui Zhang (2), Jiahao Wang (2), Chenhao Hu (2), Bingyu Liang (2)

Papers (5)

Jun 16, 2026

Longhui Zhang +5 · 2d ago

Bridging Functional Correctness and Runtime Efficiency Gaps in LLM-Based Code Translation

LLM-translated code can be slower than human-written code, but SwiftTrans bridges this gap with a novel two-stage framework that boosts both correctness and efficiency.

Longhui Zhang, Jiahao Wang, Chenhao Hu +3

Code Generation & Program Synthesis

Jiahao Wang +7 · 2d ago

SuCo: Sufficiency-guided Continuous Adaptive Reasoning

Reducing reasoning tokens while boosting accuracy, SuCo transforms how LRMs approach problem-solving by focusing on sufficiency rather than excess.

Jiahao Wang, Bingyu Liang, Chenhao Hu +5

Reasoning & Chain-of-Thought

Jinghan Wu +3 · 2d ago

Plug-and-Adapt: Multimodal Coreference Resolution at First Sight with a Pretrained Alignment Model

Achieving over 5% improvement in coreference resolution without the need for resource-intensive training on target datasets could revolutionize multimodal AI applications.

Jinghan Wu, Jing Li, Ivor W. Tsang +1

Computer Vision, Multimodal Models, Natural Language Processing

Jun 12, 2026

Haonan Qi +11 · 6d ago

IndustryBench-MIPU: Benchmarking Multi-Image Attribute Value Extraction for Industrial Products

MLLMs excel at precision but falter dramatically in extracting complete product specifications from multiple images, with a mere 49.9% recovery rate.

Haonan Qi, Jin Cao, Yongqi Zhang +9

Eval Frameworks & Benchmarks, Multimodal Models, Natural Language Processing

Jun 11, 2026

Yang Zhou +7 · 1w ago · also HUJING Digital Media & Entertainment, Orange Team, Youku Moku-Lab

MoVerse: Real-Time Video World Modeling with Panoramic Gaussian Scaffold

Transforming a single image into a fully navigable 3D world in real time could revolutionize how we interact with visual environments.