Libo Zhang

Papers on Lattice

Research focus

Multimodal Models (2) · Computer Vision (1) · Natural Language Processing (1) · Architecture Design (Transformers, SSMs, MoE) (1)

Frequent co-authors

Xin Gu (1) · Bing Fan (1) · Bing Fan (1) · Jiali Yao (1)

Papers (2)

Feb 26, 2026

Xin Gu +13 · 2w ago

Towards Long-Form Spatio-Temporal Video Grounding

Existing spatio-temporal video grounding methods choke on long videos, but this new autoregressive transformer efficiently handles them by processing frames sequentially and using memory banks with selection strategies.

Xin Gu, Bing Fan, Bing Fan +11

Computer Vision · Multimodal Models · Natural Language Processing

Feb 17, 2026

Libo Zhang +3 · 3w ago

Sparrow: Text-Anchored Window Attention with Visual-Semantic Glimpsing for Speculative Decoding in Video LLMs

Sparrow unlocks 2.8x faster inference for Video LLMs on long videos by cleverly offloading visual computation to the target model using text-anchored attention and semantic-rich intermediate states.

Libo Zhang, Zhaoning Zhang, Wangyang Hong +1

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Multimodal Models
