Wenyu Zhao

Microsoft, Beijing, China

Microsoft Research

Research focus

Distributed Systems & Hardware (1), Inference & Quantization (1), Recommendation & Information Retrieval (1)

Frequent co-authors

Zijian Shen (1), Boyuan Wang (1), Zimeng Wang (1), Wenbin Shang (1)

Papers (1)

CMU ML · Mar 9, 2026 · also Microsoft Research, Brandeis, Glasgow, USC

CAGR: A Cross-Accelerator Graph Optimization Framework for Efficient Recommender System Inference

Achieve near-optimal DLRM inference speedups across diverse hardware (NVIDIA, AMD, TPU) with a single optimization pass, eliminating the need for vendor-specific tuning.

Zijian Shen, Wenyu Zhao, Boyuan Wang +2

Distributed Systems & Hardware, Inference & Quantization, Recommendation & Information Retrieval
