FPGAs can beat GPUs at dynamically allocating computation for LLM inference, thanks to a new architecture that fuses operations, uses mixed precision, and keeps the key-value (KV) cache on-chip.
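To make the on-chip KV-cache claim concrete, here is a minimal, hypothetical sketch of why caching matters for autoregressive decoding: each new token appends one key and one value vector to the cache, so attention at step t reuses the t stored pairs instead of re-projecting the whole sequence. The toy identity projections and the `decode_step` helper are illustrative assumptions, not the paper's architecture.

```python
import math

def attention(q, K, V):
    """Single-query scaled dot-product attention over cached keys/values."""
    d = len(q)
    scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
    m = max(scores)                      # subtract max for numerical stability
    w = [math.exp(s - m) for s in scores]
    z = sum(w)
    w = [wi / z for wi in w]
    return [sum(wi * v[j] for wi, v in zip(w, V)) for j in range(d)]

# KV cache: one key and one value vector appended per decoded token,
# instead of recomputing projections for the full sequence each step.
cache = {"K": [], "V": []}

def decode_step(x, cache):
    # Toy model: identity key/value/query projections for brevity.
    cache["K"].append(x)
    cache["V"].append(x)
    return attention(x, cache["K"], cache["V"])

for t in range(4):
    out = decode_step([float(t == i) for i in range(4)], cache)

print(len(cache["K"]))  # → 4: the cache grows by one entry per token
```

On an FPGA, keeping this growing cache in on-chip memory avoids the off-chip bandwidth cost that dominates per-token latency on GPUs; the sketch only shows the data-reuse pattern, not the hardware mapping.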