Yifan Zhou

Papers on Lattice

Total citations

Topics

h-index

Publication activitypapers/week, last 8 weeks

Research focus

Architecture Design (Transformers, SSMs, MoE) (2)Distributed Systems & Hardware (2)Inference & Quantization (1)

Frequent co-authors

Zizhuo Fu (1)Zhaoxin Lu (1)Guangyu Sun (1)Runsheng Wang (1)

Papers (2)

Apr 1, 2026

Zizhuo Fu +6Apr 1, 2026·also OPPO

RePart: Efficient Hypergraph Partitioning with Logic Replication Optimization for Multi-FPGA System

Multi-FPGA partitioning gets a 52% communication efficiency boost and 11x speedup by explicitly optimizing for network topology and logic replication, blowing away prior art.

Zizhuo Fu, Yifan Zhou, Zhaoxin Lu +4

Architecture Design (Transformers, SSMs, MoE)Distributed Systems & Hardware

Feb 12, 2026

Feb 12, 2026·also PKU, Virginia Tech

Cachemir: Fully Homomorphic Encrypted Inference of Generative Large Language Model with KV Cache

Achieve practical FHE inference for Llama-3-8B with sub-100 second token generation by cleverly integrating KV caching, leaving prior art in the dust.

Ye Yu, Yifan Zhou, Pedro Soto

Architecture Design (Transformers, SSMs, MoE)Distributed Systems & Hardware Inference & Quantization

Search

Yifan Zhou

Publication activitypapers/week, last 8 weeks

Research focus

Frequent co-authors

Papers (2)