University of California, Los Angeles
FPGAs can outperform GPUs at dynamically allocating compute for LLM inference, thanks to a new architecture that fuses operations, uses mixed precision, and keeps the KV cache on-chip.
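The three ideas named above can be illustrated in software. This is a minimal, hypothetical sketch (not the paper's actual design): the KV cache is held in float16 to stand in for on-chip mixed-precision storage, and the score, softmax, and weighted-sum steps for one decode token are fused into a single function. All names and shapes here are illustrative assumptions.

```python
import numpy as np

def fused_attention_step(q, k_cache, v_cache, new_k, new_v):
    """Hypothetical fused decode step for one query token.

    Mimics the three techniques from the post:
    - on-chip KV cache: k_cache / v_cache are kept resident and appended to,
    - mixed precision: cache stored in float16, math done in float32,
    - operation fusion: scores, softmax, and output in one pass.
    """
    # Append the new key/value pair to the cache in reduced precision.
    k_cache = np.concatenate([k_cache, new_k[None].astype(np.float16)])
    v_cache = np.concatenate([v_cache, new_v[None].astype(np.float16)])

    # Compute attention scores in float32 for numerical stability.
    scores = (k_cache.astype(np.float32) @ q.astype(np.float32)) / np.sqrt(q.size)

    # Numerically stable softmax, fused with the weighted sum over values.
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    out = weights @ v_cache.astype(np.float32)
    return out, k_cache, v_cache
```

On an FPGA the analogous fusion happens in hardware: the cache lives in on-chip BRAM and the score/softmax/output stages are pipelined, so intermediate tensors never round-trip through off-chip memory the way they do between separate GPU kernels.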