MIT CSAIL | Georgia Tech | Apr 20, 2026 | arXiv:2604.17808

Enabling AI ASICs for Zero Knowledge Proof

Jianming Tong, Jing Dang, Jingtian Dang, Simon Langowski, Tianhao Huang, Asra Ali, Jeremy Kun, Jevin Jiang, Srini Devadas, Srinivas Devadas, Tushar Krishna

AI Summary

This paper introduces MORPH, a framework that reformulates zero-knowledge proof (ZKP) kernels to match the architecture of AI ASICs such as TPUs, thereby accelerating ZKP proving. MORPH introduces "Big-T complexity," a hardware-aware complexity model, to guide optimizations at both the arithmetic level (MXU-centric extended-RNS lazy reduction) and the dataflow level (a unified-sharding, layout-stationary TPU Pippenger MSM and an optimized NTT). Experiments on TPUv6e8 show up to 10x higher NTT throughput than GZKP and comparable MSM throughput.
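
For readers unfamiliar with the Pippenger MSM named in the summary, here is a minimal Python sketch of the generic bucket method. It is not MORPH's TPU implementation and does not model sharding or data layout. As an assumption made purely for illustration, elliptic-curve points are replaced by integers under addition modulo a toy modulus q, so "point addition" is modular addition. The names msm_naive, msm_pippenger, and the window width c are hypothetical.

```python
def msm_naive(scalars, points, q):
    """Reference MSM: sum of k_i * P_i in the toy additive group Z_q."""
    return sum(k * p for k, p in zip(scalars, points)) % q


def msm_pippenger(scalars, points, q, scalar_bits, c):
    """Bucket-method MSM using only group additions (toy group: Z_q under +).

    scalar_bits: upper bound on the bit length of every scalar.
    c: window width in bits (each window has 2**c - 1 non-zero buckets).
    """
    def add(a, b):
        # Stand-in for elliptic-curve point addition (assumption).
        return (a + b) % q

    num_windows = -(-scalar_bits // c)  # ceiling division
    mask = (1 << c) - 1
    result = 0  # group identity

    # Process windows from most significant to least significant.
    for w in reversed(range(num_windows)):
        # Shift the accumulated result left by c bits via c doublings.
        for _ in range(c):
            result = add(result, result)

        # Scatter each point into the bucket selected by its c-bit digit.
        buckets = [0] * (1 << c)
        for k, p in zip(scalars, points):
            digit = (k >> (w * c)) & mask
            if digit:
                buckets[digit] = add(buckets[digit], p)

        # Running-sum trick: computes sum_j j * buckets[j] with additions only.
        running = 0
        window_sum = 0
        for j in range(mask, 0, -1):
            running = add(running, buckets[j])
            window_sum = add(window_sum, running)

        result = add(result, window_sum)

    return result
```

The bucket scatter is the step whose memory-access pattern depends on the data, which is presumably why layout choices matter on matrix-oriented hardware. The sketch only demonstrates the arithmetic structure.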

Key Contribution

ZKP proving is bottlenecked by MSM and NTT operations. By reformulating these kernels for AI-ASIC execution, MORPH achieves up to 10x higher NTT throughput on TPUs than GZKP and comparable MSM throughput.
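
To make concrete how an NTT can be recast as matrix multiplications, the following Python sketch uses the standard decomposition of a length-N transform (N = rows * cols) into a matrix product, an element-wise twiddle multiplication, and a second matrix product. Whether this matches MORPH's 3-step or 5-step variants exactly is an assumption. The toy field (p = 17 with primitive 16th root of unity 3) and all function names are hypothetical choices for illustration.

```python
def matmul_mod(a, b, p):
    """Dense matrix product of a and b with entries reduced modulo p."""
    return [[sum(x * y for x, y in zip(row, col)) % p for col in zip(*b)]
            for row in a]


def ntt_naive(x, w, p):
    """Reference O(N^2) NTT: X[k] = sum_j x[j] * w^(j*k) mod p."""
    n = len(x)
    return [sum(x[j] * pow(w, j * k, p) for j in range(n)) % p
            for k in range(n)]


def ntt_matmul_decomposed(x, w, p, rows, cols):
    """NTT of length rows*cols as matmul, twiddle multiply, matmul.

    w must be a primitive (rows*cols)-th root of unity modulo p.
    """
    assert len(x) == rows * cols

    # View the input as a rows x cols matrix: a[n1][n2] = x[n1 + rows*n2].
    a = [[x[n1 + rows * n2] for n2 in range(cols)] for n1 in range(rows)]

    # DFT matrices for the two short transforms.
    f_cols = [[pow(w, rows * n2 * k2, p) for k2 in range(cols)]
              for n2 in range(cols)]
    f_rows = [[pow(w, cols * k1 * n1, p) for n1 in range(rows)]
              for k1 in range(rows)]

    # Step 1: length-cols NTT of every row, expressed as one matrix product.
    b = matmul_mod(a, f_cols, p)

    # Step 2: element-wise twiddle factors w^(n1*k2).
    b = [[b[n1][k2] * pow(w, n1 * k2, p) % p for k2 in range(cols)]
         for n1 in range(rows)]

    # Step 3: length-rows NTT of every column, as a second matrix product.
    y = matmul_mod(f_rows, b, p)

    # Output index is cols*k1 + k2, i.e. row-major flattening of y.
    return [y[k1][k2] for k1 in range(rows) for k2 in range(cols)]
```

Calling ntt_matmul_decomposed(list(range(16)), 3, 17, 4, 4) should agree with ntt_naive on the same input. Note that the input is read in column-major order and the output is written in row-major order, so this formulation needs no explicit transpose between the two products.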

Abstract

Zero-knowledge proof (ZKP) provers remain costly because multi-scalar multiplication (MSM) and number-theoretic transforms (NTTs) are compute-intensive and dominate runtime. AI ASICs such as TPUs provide massive matrix throughput and state-of-the-art energy efficiency. We present MORPH, the first framework that reformulates ZKP kernels to match AI-ASIC execution. We introduce Big-T complexity, a hardware-aware complexity model that exposes heterogeneous bottlenecks and layout-transformation costs ignored by Big-O. Guided by this analysis, (1) at the arithmetic level, MORPH develops an MXU-centric extended-RNS lazy reduction that converts high-precision modular arithmetic into dense low-precision GEMMs, eliminating all carry chains, and (2) at the dataflow level, MORPH constructs a unified-sharding, layout-stationary TPU Pippenger MSM and an optimized 3/5-step NTT that avoid on-TPU shuffles to minimize costly memory reorganization. Implemented in JAX, MORPH enables TPUv6e8 to achieve up to 10x higher throughput on NTT than GZKP and comparable throughput on MSM. Our code: https://github.com/EfficientPPML/MORPH.
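
The idea of trading carry chains for many small, independent multiply-accumulates can be illustrated with a residue number system (RNS). The Python sketch below is a simplified illustration under stated assumptions, not MORPH's extended-RNS scheme: it computes a modular dot product by working channel-wise on residues below 256, accumulates without intermediate reduction (the "lazy" part), and reconstructs the result via the Chinese Remainder Theorem only once at the end. The modulus list, the toy modulus M, and all function names are hypothetical.

```python
from math import prod

# Eight pairwise-coprime primes below 256, so every residue fits in 8 bits.
# Their product (about 2**62.8) must exceed the largest exact value represented.
MODULI = [251, 241, 239, 233, 229, 227, 223, 211]


def to_rns(x):
    """Map an integer to its vector of residues, one per channel."""
    return [x % m for m in MODULI]


def from_rns(residues):
    """Chinese Remainder Theorem reconstruction into [0, prod(MODULI))."""
    q = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        qi = q // m
        total += r * qi * pow(qi, -1, m)  # pow(..., -1, m) is a modular inverse
    return total % q


def rns_dot_lazy(a_vals, b_vals):
    """Dot product in RNS with one reduction per channel (lazy reduction).

    Each channel is an independent small-integer multiply-accumulate with
    no carries propagating between channels.
    """
    a = [to_rns(x) for x in a_vals]  # n x channels matrix of small residues
    b = [to_rns(x) for x in b_vals]
    out = []
    for ch, m in enumerate(MODULI):
        # Accumulate raw products; no reduction inside the loop.
        acc = sum(a[i][ch] * b[i][ch] for i in range(len(a)))
        out.append(acc % m)  # single deferred reduction for this channel
    return out


def dot_mod(a_vals, b_vals, modulus):
    """Dot product modulo `modulus`, computed through the RNS path.

    Assumes the exact dot product is smaller than prod(MODULI).
    """
    return from_rns(rns_dot_lazy(a_vals, b_vals)) % modulus
```

With four inputs below 2**28 per vector, the exact dot product stays below 2**58, which fits inside the product of the moduli, so dot_mod agrees with direct big-integer arithmetic. Scaling to cryptographic field sizes would need many more channels and a reduction that stays inside RNS, which this sketch does not attempt.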

Architecture Design (Transformers, SSMs, MoE) | Distributed Systems & Hardware | Inference & Quantization

Citation Metrics

Citations: 0
Influential citations: 0
References: 34
Year: 2026
Venue: N/A
