Aditya Ukarande

Papers on Lattice

Total citations

Topics

h-index

Publication activity (papers/week, last 8 weeks)

Research focus

Architecture Design (Transformers, SSMs, MoE) (1); Distributed Systems & Hardware (1); Inference & Quantization (1); Multimodal Models (1)

Frequent co-authors

Aditya Ukarande (1), Deep Shekhar (1), Deep Shekhar (1), Marc Blackstein (1)

Papers (1)

Apr 29, 2026

Aditya Ukarande +7, 3w ago

Efficient, VRAM-Constrained xLM Inference on Clients

Squeezing high-accuracy LLMs and VLMs onto client devices is now significantly more feasible, thanks to a new pipelined sharding technique that achieves up to 30x speedups and 10x VRAM reduction.

Aditya Ukarande, Aditya Ukarande, Deep Shekhar +5

Architecture Design (Transformers, SSMs, MoE); Distributed Systems & Hardware; Inference & Quantization; +1
