Chain Reaction | Feb 23, 2026 | arXiv:2602.19550

Hardware-Friendly Randomization: Enabling Random-Access and Minimal Wiring in FHE Accelerators with Low Total Cost

Ilan Rosenfeld, Noam Kleinburd, Hillel Chapman, Dror Reuven
Chain Reaction, Ltd

AI Summary

The paper proposes a hardware-friendly scheme for generating the uniformly random polynomial 'a' in RLWE-based FHE accelerators, aiming to reduce communication overhead and improve hardware efficiency. The scheme allows parallel generation of uniformly distributed samples with relaxed wiring requirements and unrestricted random-access to RNS limbs. The approach achieves a low overhead on the client side (less than 3%) during key generation and reduces power consumption in high-throughput configurations.

Key Contribution

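Cut the Watts: The scheme generates the random polynomial on-the-fly inside the FHE accelerator, removing the need for thick metal layers for randomness distribution and keeping the power consumption of the PRNG subsystem from scaling prohibitively with the acceleration factor.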

Abstract

The Ring-Learning With Errors (RLWE) problem forms the backbone of highly efficient Fully Homomorphic Encryption (FHE) schemes. A significant component of the RLWE public key and ciphertext of the form $(b,a)$ is the uniformly random polynomial $a \in R_q$. While essential for security, the communication overhead of transmitting $a$ from client to server, and inputting it into a hardware accelerator, can be substantial, especially for FHE accelerators aiming at high acceleration factors. A known technique for reducing this overhead generates $a$ from a small seed on the client side via a deterministic process, transmits only the seed, and regenerates $a$ on-the-fly within the accelerator. Challenges in the hardware implementation of an accelerator include wiring (density and power), compute area, compute power, and flexibility in scheduling on-the-fly generation instructions. This extended abstract proposes a concrete scheme and parameters that address these practical challenges. We detail the benefits of our approach, which maintains the reduction in communication latency and memory footprint while allowing parallel generation of uniformly distributed samples, relaxed wiring requirements, unrestricted random access to RNS limbs, and an extremely low overhead on the client side (less than 3%) during the key generation process. The proposed scheme eliminates the need for thick metal layers for randomness distribution and prevents the power consumption of the PRNG subsystem from scaling prohibitively with the acceleration factor, potentially saving tens of Watts per accelerator chip in high-throughput configurations.
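The seed-expansion idea described in the abstract can be illustrated with a minimal sketch. This is not the scheme proposed in the paper, whose construction and parameters are not given on this page; it is a generic textbook approach that assumes SHAKE-128 as the expander, rejection sampling for uniformity, and domain separation by limb index so that any RNS limb can be generated independently of the others. The function name `expand_limb`, the 64-candidate block size, and the moduli used below are all hypothetical.

```python
import hashlib

def expand_limb(seed: bytes, limb_index: int, q: int, n: int) -> list[int]:
    """Derive n coefficients, uniform in [0, q), for one RNS limb from a seed.

    Illustrative only: the hash choice, input encoding, and block size are
    assumptions, not the construction from the paper.
    """
    bits = q.bit_length()
    nbytes = (bits + 7) // 8
    mask = (1 << bits) - 1
    coeffs: list[int] = []
    counter = 0
    while len(coeffs) < n:
        # Domain separation: the stream depends only on (seed, limb, counter),
        # so a limb can be regenerated without touching any other limb.
        xof = hashlib.shake_128(
            seed + limb_index.to_bytes(2, "little") + counter.to_bytes(4, "little")
        )
        block = xof.digest(nbytes * 64)
        for i in range(0, len(block), nbytes):
            candidate = int.from_bytes(block[i:i + nbytes], "little") & mask
            # Rejection sampling: keep only candidates below q to avoid bias.
            if candidate < q:
                coeffs.append(candidate)
                if len(coeffs) == n:
                    break
        counter += 1
    return coeffs

# Client and accelerator both hold the same short seed; only the seed is sent.
seed = bytes(16)
limb_2 = expand_limb(seed, limb_index=2, q=65537, n=8)
```

Because each limb's stream is keyed by its own index, limbs can be requested in any order or in parallel and always yield the same coefficients. Rejection sampling consumes a variable amount of expander output per coefficient, which is one of the scheduling concerns a hardware-oriented design has to deal with.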

Architecture Design (Transformers, SSMs, MoE) | Distributed Systems & Hardware | Inference & Quantization

Citation Metrics

Citations: 0

Influential citations: 0

References: 26

Year: 2026

Venue: N/A
