Mohammed Sabry

Papers on Lattice

Total citations

Topics

h-index

Publication activity (papers/week, last 8 weeks)

Research focus

Architecture Design (Transformers, SSMs, MoE) (1)
Inference & Quantization (1)
Training Efficiency & Optimization (1)

Frequent co-authors

Anya Belz (1)

Papers (1)

May 5, 2026

Mohammed Sabry +1, 2w ago

Budgeted LoRA: Distillation as Structured Compute Allocation for Efficient Inference

Budgeted LoRA redistributes compute between dense and low-rank pathways during distillation, giving 4x faster LLM inference and outperforming standard LoRA in both speed and function-style in-context learning.

Mohammed Sabry, Anya Belz

Architecture Design (Transformers, SSMs, MoE), Inference & Quantization, Training Efficiency & Optimization
