LLM safety is a cat-and-mouse game: ORPO excels at breaking alignment, while DPO is best at restoring it, though that restoration comes at the cost of overall usefulness.
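For concreteness, here is a minimal PyTorch sketch of the two preference-optimization objectives named above; the hyperparameters `beta` and `lam`, the function names, and the assumption that per-sequence log-probabilities are already computed are all illustrative, not code or values from the source.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO: reward margin is the policy-vs-reference log-ratio difference."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

def orpo_loss(chosen_logps, rejected_logps, sft_nll, lam=0.1):
    """ORPO: SFT loss plus an odds-ratio penalty; no reference model needed.
    chosen_logps / rejected_logps are mean per-token log-probabilities."""
    # log odds(y) = log p(y) - log(1 - p(y)), computed in log-space for stability
    log_odds_chosen = chosen_logps - torch.log1p(-torch.exp(chosen_logps))
    log_odds_rejected = rejected_logps - torch.log1p(-torch.exp(rejected_logps))
    odds_ratio_term = -F.logsigmoid(log_odds_chosen - log_odds_rejected).mean()
    return sft_nll + lam * odds_ratio_term
```

The key structural difference the snippet alludes to is visible here: DPO optimizes purely against a frozen reference policy, while ORPO folds the preference signal into the supervised fine-tuning loss itself.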
Multimodal models can "see" the image but still fail at reasoning because the visual input distracts the mixture-of-experts routing mechanism from activating the right experts.
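As a reference point, here is an illustrative top-k mixture-of-experts router in PyTorch; the class name, shapes, and `k=2` are assumptions for the sketch, not details from the source. Because image and text tokens pass through the same gate, a distracting visual token can shift which experts fire for the reasoning step.

```python
import torch
import torch.nn as nn

class TopKRouter(nn.Module):
    """Illustrative top-k MoE router: a linear gate scores experts per token."""
    def __init__(self, hidden_dim: int, num_experts: int, k: int = 2):
        super().__init__()
        self.gate = nn.Linear(hidden_dim, num_experts, bias=False)
        self.k = k

    def forward(self, tokens: torch.Tensor):
        # tokens: (num_tokens, hidden_dim) -- text and image tokens alike
        logits = self.gate(tokens)                        # (num_tokens, num_experts)
        weights, expert_ids = logits.topk(self.k, dim=-1) # pick the k highest-scoring experts
        weights = weights.softmax(dim=-1)                 # normalize over the chosen experts
        return weights, expert_ids                        # which experts each token activates
```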