Despite their training efficiency, MoE models can be up to 4.5x slower at inference than quality-matched dense models due to memory fragmentation, especially in long-context scenarios.