Mingyang Song

Research focus

Training Efficiency & Optimization (3) · Inference & Quantization (2) · RLHF & Preference Learning (2) · Natural Language Processing (2) · Data Curation & Synthetic Data (1)

Frequent co-authors

Mao Zheng (4) · Chenning Xu (2) · Shuang Chen (1) · Quanxin Shou (1)

Papers (5)

Apr 2, 2026

NUS · Apr 2, 2026 · also CAS, Southwest U, Tencent AI

Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

Achieve the best of both worlds in LLM policy optimization: SRPO combines the rapid gains of self-distillation with the long-term stability of group-relative methods, outperforming both by adaptively routing samples.

Mingyang Song, Mao Zheng

Inference & Quantization · RLHF & Preference Learning · Training Efficiency & Optimization

Chenning Xu +2 · Apr 2, 2026 · also Southwest U

PRISM: Probability Reallocation with In-Span Masking for Knowledge-Sensitive Alignment

Factually dubious LLM outputs can be tamed by strategically penalizing high-confidence predictions at "risky" tokens during fine-tuning, guided by sentence-level factuality labels.

Chenning Xu, Mao Zheng, Mingyang Song

Data Curation & Synthetic Data · Natural Language Processing · Training Efficiency & Optimization

Apr 1, 2026

Mingyang Song +1 · Apr 1, 2026 · also Southwest U

A Survey of On-Policy Distillation for Large Language Models

On-Policy Distillation could be the key to more robust and reliable LLM knowledge transfer, but the field is fragmented and lacks a unified theoretical understanding.

Mingyang Song, Mao Zheng

Inference & Quantization · Natural Language Processing · Training Efficiency & Optimization

Mar 31, 2026

Shuang Chen +18 · Mar 31, 2026 · also Meituan, Tencent AI

Unify-Agent: A Unified Multimodal Agent for World-Grounded Image Synthesis

By tightly coupling reasoning, searching, and generation, Unify-Agent demonstrates that agent-based modeling can substantially improve world knowledge grounding in image synthesis, rivaling closed-source models.

Shuang Chen, Quanxin Shou, Hangting Chen +16

Computer Vision · Multimodal Models · Tool Use & Agents

Mar 11, 2026

Mingyang Song +2 · Mar 11, 2026 · also Southwest U

Beyond the Illusion of Consensus: From Surface Heuristics to Knowledge-Grounded Evaluation in LLM-as-a-Judge

LLM-as-a-judge consensus is often an illusion: models agree on surface-level features, but diverge wildly when evaluating true quality, a problem fixable by injecting domain knowledge into rubrics.

Mingyang Song, Mao Zheng, Chenning Xu

Constitutional AI & AI Ethics · Eval Frameworks & Benchmarks · RLHF & Preference Learning
