Baiyang Song

Key Laboratory of Multimedia Trusted Perception and Efficient Computing

Papers on Lattice

Total citations

Topics

Publication activity (papers/week, last 8 weeks)

Research focus

Computer Vision (2), Multimodal Models (2)

Frequent co-authors

Tao Chen (2), Yiyi Zhou (2), Rongrong Ji (2), Yuli Lin (1)

Papers (2)

Jun 24, 2026

Key Laboratory of Multimedia Trusted · 4d ago

Towards a Dynamic and Fixed-budget Memory Bank for Efficient Streaming Video Understanding

CausalMem achieves over 20x visual token compression while maintaining high accuracy in streaming video understanding, redefining memory efficiency in MLLMs.

Baiyang Song, Yuli Lin, Qiong Wu +5

Computer Vision, Multimodal Models

Jun 23, 2026

Kun Zhang +6 · 5d ago · also Key Laboratory of Multimedia Trusted, Xiamen University

Towards Fast and Effective Long Video Understanding of Multimodal Large Language Models via Adaptive Quasi-Gaussian Sampling

AdaQ enables MLLMs to achieve superior long video understanding with just 64 frames, outperforming state-of-the-art methods by a striking margin.

Kun Zhang, Chenxin Fang, Tao Chen +4