Yunhai Tong

Research focus

Multimodal Models (2), Computer Vision (2), Data Curation & Synthetic Data (1), Natural Language Processing (1), Eval Frameworks & Benchmarks (1)

Frequent co-authors

Chao Tang (1), Jianzong Wu (1), Qingyu Shi (1), Ye Tian (1)

Papers (3)

May 1, 2026

Towards Customized Multimodal Role-Play

Forget generic chatbots – now, with just 10 images and interaction examples, you can fine-tune a model to embody a specific character with a consistent persona, dialogue style, and visual identity across text and images.

Chao Tang, Jianzong Wu, Qingyu Shi +5

Data Curation & Synthetic Data Multimodal Models Natural Language Processing

Apr 2, 2026

VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification

Despite impressive headline scores, today's best video MLLMs can't reliably ground their answers in space and time, achieving <1% accuracy when required to identify the spatio-temporal evidence supporting their predictions.

Jiahao Meng, Tan Yue, Qi Xu +7

Computer Vision Eval Frameworks & Benchmarks Multimodal Models

Mar 19, 2026

PKU

Rethinking Vector Field Learning for Generative Segmentation

Diffusion models can generate segmentations that rival discriminative methods, but only if you reshape their vector fields with a distance-aware correction term that combats gradient vanishing.

Chaoyang Wang, Chaoyang Wang, Yaobo Liang +6

Architecture Design (Transformers, SSMs, MoE) Computer Vision
