Jing Lyu

Research focus

Computer Vision (4), Multimodal Models (4), Eval Frameworks & Benchmarks (2), World Models & Planning (1)

Frequent co-authors

Fengyun Rao (2), Shengjun Zhang (1), Zhang Zhang (1), Simin Huang (1)

Papers (5)

May 30, 2026

Shengjun Zhang +13 · 3w ago

MBench: A Comprehensive Benchmark on Memory Capability for Video World Models

Existing video world models struggle with long-term memory retention, and MBench exposes their critical limitations while providing a structured path for future improvements.

Shengjun Zhang, Zhang Zhang, Simin Huang +11

Eval Frameworks & Benchmarks, World Models & Planning

May 26, 2026

3w ago · also HKUST, Tencent AI

REVERSE: Reinforcing Evidence Verification and Search for Agentic Image geo-localization

Forget brute-force scaling: REVERSE shows that teaching an agent *how* to search and verify evidence lets a smaller model beat giants at image geo-localization.

Furong Jia, Dacheng Yin, Kang Rong +2

Computer Vision, Multimodal Models, Tool Use & Agents

May 25, 2026

3w ago · also Tencent AI

DRM: Diffusion-based Reward Model With Step-wise Guidance

Aligning diffusion models with human preferences just got a fidelity upgrade: DRM leverages the generative backbone itself for rewards, unlocking step-wise guidance that boosts image quality.

Jaxon Zhang, Binxin Yang, Hubery Yin +1

Computer Vision, Multimodal Models, RLHF & Preference Learning

May 18, 2026

also PKU, Tencent AI

OmniPro: A Comprehensive Benchmark for Omni-Proactive Streaming Video Understanding

Current video understanding models struggle with long-horizon robustness and non-speech audio, as revealed by the new OmniPro benchmark designed for comprehensive omni-modal proactive evaluation.

Ruixiang Zhao, Jie Yang, Zijie Xin +4

Computer Vision, Eval Frameworks & Benchmarks, Multimodal Models +1

Apr 2, 2026

Dingming Liu +3

From Understanding to Erasing: Towards Complete and Stable Video Object Removal

Removing objects from video now means removing their shadows and reflections too, thanks to a new method that teaches diffusion models to "understand" object-scene physics.

Dingming Liu, Wenjing Wang, Chen Li +1

Computer Vision, Multimodal Models