Doc-V* demonstrates that an agentic approach to multi-page document VQA, using active navigation and structured memory, can significantly outperform retrieval-augmented generation, especially in out-of-domain scenarios.
VideoLLMs can now watch and think *simultaneously*, achieving 15x faster response times and improved accuracy on video understanding tasks.
By jointly training a keyframe sampler with an MLLM, MSJoE achieves state-of-the-art accuracy in long-form video understanding while significantly reducing computational cost.
Unleashing powerful reasoning in OLLMs doesn't require expensive training data or compute; clever guidance from existing Large Reasoning Models is enough.