Qi Dai

Microsoft Research Asia

Microsoft Research

Papers on Lattice

Research focus

Tool Use & Agents (4), Multimodal Models (3), Eval Frameworks & Benchmarks (2), Computer Vision (1)

Frequent co-authors

Chong Luo (6), Kai Qiu (4), Yifan Yang (4), Bei Liu (3)

Papers (6)

Jun 10, 2026

Microsoft Research·1w ago·also SNU, University of Science and Technology

A Comprehensive Ecosystem for Open-Domain Customized Video Generation

A million-scale dataset for identity-preserving video generation enables a new benchmark that outperforms existing models with minimal parameter overhead.

Jingxu Zhang, Yuqian Hong, Daneul Kim +4

Computer Vision, Data Curation & Synthetic Data, Multimodal Models

1w ago·also Microsoft Research

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

Arbor's innovative approach to autonomous research enables a cumulative learning process that outperforms existing models by over 2.5 times in real-world tasks.

Jiajie Jin, Yuyang Hu, Kai Qiu +14

Scientific Discovery & Drug Design, Tool Use & Agents

May 22, 2026

3w ago·also Microsoft Research, SJTU

From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills

Model-generated skills can actually hurt agent performance, and bigger models don't necessarily make for better skill extractors or consumers.

Zisu Huang, Jingwen Xu, Yifan Yang +13

Eval Frameworks & Benchmarks, Tool Use & Agents

Microsoft Research·3w ago·also Fudan, SJTU, Tongji

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

SkillOpt transforms agent skill development into a reproducible optimization process, achieving state-of-the-art results by treating skills as trainable parameters.

Yifan Yang, Ziyang Gong, Weiquan Huang +12

Natural Language Processing, Tool Use & Agents, Training Efficiency & Optimization

Apr 16, 2026

Zezi Zeng +13·Apr 16, 2026·also Microsoft Research

MM-WebAgent: A Hierarchical Multimodal Web Agent for Webpage Generation

Hierarchical planning and self-reflection can finally wrangle AIGC tools into producing coherent, visually consistent webpages.

Zezi Zeng, Yifan Yang, Yuqing Yang +11

Code Generation & Program Synthesis, Multimodal Models, Tool Use & Agents

Apr 9, 2026

Ziwei Zhou +13·Apr 9, 2026·also Microsoft Research

AVGen-Bench: A Task-Driven Benchmark for Multi-Granular Evaluation of Text-to-Audio-Video Generation

Today's best text-to-audio-video models may look and sound impressive, but they still struggle with basic physics, coherent speech, and even rendering text correctly.

Ziwei Zhou, Ziwei Zhou, Zeyuan Lai +11

Eval Frameworks & Benchmarks, Multimodal Models, Speech & Audio
