Lattice: the structure behind the noise

Created by Flynn Lachendro

Yusheng Zheng

University of California, Santa Cruz

Papers on Lattice

3

Total citations

0

Topics

4

h-index

4

Research focus

Distributed Systems & Hardware (3) · Training Efficiency & Optimization (2) · Inference & Quantization (1) · Code Generation & Program Synthesis (1)

Frequent co-authors

Andi Quinn (2) · Yuhang Gan (1) · Wenan Mao (1) · Shuyi Cheng (1)

Papers (3)

University of California Santa Cruz · Apr 20, 2026 · also BAIR, UW, UCSC

GPUOS: A GPU Operating System Primitive for Transparent Operation Fusion

Kernel launch overhead is a bigger bottleneck than you think: GPUOS achieves up to 15.3x speedup by fusing operations at runtime.

Yuhang Gan, Yusheng Zheng, Andi Quinn

Distributed Systems & Hardware · Inference & Quantization · Training Efficiency & Optimization

Mar 31, 2026

SysOM-AI: Continuous Cross-Layer Performance Diagnosis for Production AI Training

Diagnose performance bottlenecks in large-scale AI training 100x faster with a new observability system that adds almost no overhead.

Yusheng Zheng, Wenan Mao, Shuyi Cheng +9

Distributed Systems & Hardware · Training Efficiency & Optimization

Mar 12, 2026

NCCLbpf: Verified, Composable Policy Execution for GPU Collective Communication

Hot-patching NCCL with eBPF lets you boost AllReduce throughput by 27% *and* verify plugin safety, all without modifying NCCL itself.

Code Generation & Program Synthesis · Distributed Systems & Hardware

Guangshui Li (1)

Zhaoyan Liao (1)

Yongzhuo Huang (1)