Li Dong

Papers on Lattice

Research focus

Tool Use & Agents (4) · RLHF & Preference Learning (2) · Architecture Design (Transformers, SSMs, MoE) (2) · Inference & Quantization (2)

Frequent co-authors

Furu Wei (3) · Jianyong Wang (2) · Tianzhu Ye (2) · Shaohan Huang (2)

Papers (6)

Jul 6, 2026

Microsoft Research · 1w ago · also aka.ms, GeneralAI, UvA

Multi-Turn On-Policy Distillation with Prefix Replay

ReOPD turns the costly process of agent-environment interaction into a reusable offline resource, achieving faster training while preserving accuracy.

Baohao Liao, Hanze Dong, Christof Monz +4

RLHF & Preference Learning · Tool Use & Agents

Jun 9, 2026

Tengchao Lv +11 · Jun 9, 2026 · also Microsoft Research

Mind the Gap: Can Frontier LLMs Pass a Standardized Office Proficiency Exam?

LLMs struggle with Office automation, scoring only 36.6% on a standardized proficiency exam, revealing a critical gap in their capabilities.

Tengchao Lv, Dongdong Zhang, Jiayu Ding +9

Eval Frameworks & Benchmarks · Tool Use & Agents

Jun 4, 2026

Microsoft Research · Jun 4, 2026

You Only Index Once: Cross-Layer Sparse Attention with Shared Routing

Achieving up to 7.6x faster decoding and 17.1x greater throughput, CLSA redefines efficiency in long-context LLMs without compromising accuracy.

Yutao Sun, Yanqi Zhang, Li Dong +2

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Reasoning & Chain-of-Thought

Apr 1, 2026

Shaopeng Fu +2 · Apr 1, 2026 · also Tsinghua AI, NUDT

RefineRL: Advancing Competitive Programming with Self-Refinement Reinforcement Learning

Forget scaling model size: RefineRL shows that incentivizing self-refinement in smaller LLMs lets them punch *way* above their weight, rivaling models 10x larger on competitive programming tasks.

Shaopeng Fu, Xingxing Zhang, Li Dong

Code Generation & Program Synthesis · Reasoning & Chain-of-Thought · Tool Use & Agents

Yutao Sun +5 · Apr 1, 2026

Universal YOCO for Efficient Depth Scaling

By cleverly combining YOCO's efficient attention with recursive computation, YOCO-U achieves a capability-efficiency sweet spot that neither technique can reach on its own.

Yutao Sun, Li Dong, Tianzhu Ye +3

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Training Efficiency & Optimization

Mar 17, 2026

Tianzhu Ye +8 · Mar 17, 2026 · also BIT, Qinzheng Sun1

Online Experiential Learning for Language Models

Language models can learn directly from real-world user interactions, boosting performance without human annotations or simulated environments.

Tianzhu Ye, Li Dong, Li Dong +6

Natural Language Processing · RLHF & Preference Learning · Tool Use & Agents
