Enterprise LLM agents leak sensitive information in up to 50% of interactions, and, counterintuitively, agents that perform better at their tasks make the problem *worse*.
Token-level attribution struggles to pinpoint the causes of LLM failures in realistic settings, suggesting that current interpretability tools may not be up to the task of debugging complex model behaviors.
Stop obsessing over state prediction accuracy in text-based world models: aligning them with *behavior* yields better long-term planning and evaluation.
Forget hand-crafted templates: DUET learns to generate user and item profiles jointly, boosting recommendation accuracy by better aligning textual representations.
Autonomous web agents get a serious upgrade with WebXSkill, which lets them learn and execute skills with both code-level precision and human-readable guidance.
Knowing the *perfect* API to use or *exact* location to edit could drastically improve SWE agent performance, but knowing the perfect regression test result? Not so much.
LLMs don't learn fundamentally new reasoning representations during training; they just get faster at converging to the right answer.
World models can now effectively simulate complex desktop software environments like Microsoft Office, enabling agents to reason about actions before execution and significantly improving performance.
Ditch the army of task-specific models: AdNanny shows a single, reasoning-centric LLM can handle diverse offline advertising tasks with improved accuracy and reduced manual effort.