Weixian Lei

Research focus

Computer Vision (2) · Multimodal Models (2) · Architecture Design (Transformers, SSMs, MoE) (1)

Frequent co-authors

Yan Fang (1) · Mengcheng Lan (1) · Zilong Huang (1) · Yunqing Zhao (1)

Papers (2)

May 1, 2026

Yan Fang +9 · May 1, 2026 · also ByteDance

Let ViT Speak: Generative Language-Image Pre-training

Ditch the complex multimodal pre-training pipelines: GenLIP proves a simple language modeling objective can effectively align vision encoders with LLMs, achieving strong performance with less data.

Yan Fang, Mengcheng Lan, Zilong Huang +7

Architecture Design (Transformers, SSMs, MoE) · Computer Vision · Multimodal Models

Apr 8, 2026

NUS · Apr 8, 2026 · also Central South University, Tencent AI

FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching

Forget text-centric pipelines: FlowInOne achieves state-of-the-art multimodal generation by unifying text, layouts, and instructions into a single visual flow, outperforming both open-source and commercial systems.

Junchao Yi, Weixian Lei, Qi Su +3

Computer Vision · Multimodal Models
