Shaolin Zhu

Papers on Lattice

Research focus

Natural Language Processing (3) · Architecture Design (Transformers, SSMs, MoE) (2) · Computer Vision (2) · Multimodal Models (2)

Frequent co-authors

Tianyu Dong (3) · Ningyuan Deng (2) · Lijie Wen (2) · Yangyang Liu (1)

Papers (4)

Jun 24, 2026

DAMO · Jun 24, 2026 · also Astronautics WeChat AI, Fudan, NJU, Stevens +2

SARA: Unlocking Multilingual Knowledge in Mixture-of-Experts via Semantically Anchored Routing Alignment

SARA unlocks the potential of low-resource languages in multilingual models by aligning their expert routing with high-resource anchors, leading to measurable performance gains.

Tianyu Dong, Yangyang Liu, Jiang Zhou +7

Architecture Design (Transformers, SSMs, MoE) · Natural Language Processing

May 23, 2026

Tsinghua AI · May 23, 2026 · also HKUST, TJU

VaaWIT: Visual-Aware Adaptation of Large Language Models for Multilingual Web Image Translation

LLMs can now translate text in web images with significantly improved accuracy and efficiency thanks to a novel visual-aware adaptation framework that bridges the gap between high-level semantics and fine-grained visual details.

Ronghao Chen, Ningyuan Deng, Huacan Wang +2

Computer Vision · Multimodal Models · Natural Language Processing

Tianyu Dong +1 · May 23, 2026

Mix-MoE: Improving Multilingual Machine Translation of Large Language Models through Mixed MoEs

LLMs can learn multilingual translation far more effectively by explicitly separating and routing language modeling and translation knowledge during fine-tuning.

Tianyu Dong, Shaolin Zhu

Architecture Design (Transformers, SSMs, MoE) · Natural Language Processing · Training Efficiency & Optimization

Apr 18, 2026

Apr 18, 2026 · also Tsinghua AI, Baidu, RUC, SJTU +1

MNAFT: modality neuron-aware fine-tuning of multimodal large language models for image translation

Targeted neuron fine-tuning can unlock superior image translation capabilities in multimodal large language models, outperforming traditional methods by preserving pre-trained knowledge.

Ningyuan Deng, Tianyu Dong, Shaobo Wang +2

Computer Vision · Multimodal Models
