Peiran Yin

Ningbo Institute of Digital Twin, Eastern Institute of Technology

Research focus

Inference & Quantization (2) · Eval Frameworks & Benchmarks (1) · Scaling Laws & Emergent Abilities (1) · Architecture Design (Transformers, SSMs, MoE) (1)

Frequent co-authors

Haozhe Hu (1) · Anhao Zhao (1) · Longwei Ding (1) · Yunpu Ma (1)

Papers (2)

Jun 8, 2026

Ningbo Institute of Digital Twin · 1w ago · also Eastern Institute of Technology, LMU, PolyU

Beyond FLOPs: Benchmarking Real Inference Acceleration of LLM Pruning under a GEMM-Centric Taxonomy

Static depth pruning emerges as the most effective strategy for LLM acceleration, achieving near-theoretical speedup limits in memory-bounded contexts.

Haozhe Hu, Anhao Zhao, Longwei Ding +2

Eval Frameworks & Benchmarks · Inference & Quantization · Scaling Laws & Emergent Abilities

Mar 4, 2026 · also Eastern Institute of Technology, Ningbo Institute of Digital Twin

From Static Inference to Dynamic Interaction: Navigating the Landscape of Streaming Large Language Models

Untangling the mess of "streaming LLMs," this paper delivers a clear taxonomy that distinguishes between streaming generation, streaming inputs, and interactive architectures.

Zilong Wang, YuJie Ren, Peiran Yin +2

Architecture Design (Transformers, SSMs, MoE) · Inference & Quantization · Natural Language Processing
