May 21, 2026 | arXiv:2605.22237

Decision-Aware Quadratic ReLU Replacement for HE-Friendly Inference

AI Summary

This paper introduces a decision-aware method for replacing ReLU activations with quadratic polynomials in a trained single-hidden-layer MLP for homomorphic encryption (HE) inference, without retraining. The authors formulate quadratic replacement as a linear separation problem when the calibration set is positive-margin separable in the lifted space, which yields necessary and sufficient conditions for calibration-lossless replacement. For non-separable cases, they use reduced convex hulls and Lagrangian-dual soft-margin relaxations to find approximately feasible coefficients. Under CKKS, the quadratic replacement matches plaintext top-1 accuracy on multiple benchmarks while running 3.7--4.1x faster than Remez-7 in the activation module and 1.18--1.68x faster end-to-end.

Key Contribution

Ditch the high-degree polynomials: under CKKS, decision-aware quadratic ReLU replacements run the activation module 3.7--4.1x faster than Remez-7 (1.18--1.68x faster end-to-end) while matching plaintext top-1 accuracy.

Abstract

Fully homomorphic encryption (FHE) supports only additions and multiplications, so FHE-only neural-network inference typically replaces ReLU with polynomials fitted over empirical activation intervals. Such interval fitting often requires higher-degree polynomials to control activation error, incurring homomorphic evaluation costs, while classification is determined by the final logit decision. We revisit ReLU replacement from a decision-aware perspective: given a trained single-hidden-layer ReLU MLP and a specified calibration set, can an HE-friendly low-degree polynomial replace ReLU without retraining while preserving calibration-set decisions? We focus on quadratic replacement, the lowest-degree choice that retains a genuine per-unit nonlinearity. For calibration sets positive-margin separable in the lifted space, we formulate quadratic replacement as a linear separation problem, yielding necessary and sufficient conditions for calibration-lossless replacement and a constructive algorithm for the coefficients. When the positive-margin condition fails -- typically due to a few misclassified calibration samples -- we extend the same geometric framework via reduced convex hulls and Lagrangian-dual soft-margin relaxations, which bound the influence of any single sample, converting the problem into smaller convex quadratic programs that yield approximately feasible coefficients with high empirical agreement on calibration-set decisions. In particular, at the maximal weight cap $\mu=1$, the reduced-convex-hull relaxation reduces to the convex-hull separation of the strictly separable case; the relaxation thus continuously extends the exact theory. Under CKKS, the quadratic replacement matches plaintext top-1 accuracy on multiple benchmarks, running 3.7--4.1$\times$ faster than Remez-7 in the activation module and 1.18--1.68$\times$ faster end-to-end.
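The "linear separation in the lifted space" idea can be illustrated with a small sketch. This is not the paper's algorithm: it assumes one quadratic p(z) = a*z^2 + b*z + c shared by all hidden units, and the toy weights, calibration inputs, coefficients, and function names are all hypothetical. Under that assumption each logit is linear in (a, b, c), so preserving the ReLU network's decision on a calibration sample amounts to a set of linear inequalities in the three coefficients.

```python
# Illustrative sketch only: toy network, shared quadratic coefficients (assumption).

def relu(z):
    return max(z, 0.0)

def preacts(W1, b1, x):
    # Hidden pre-activations z_j = W1[j] . x + b1[j]
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]

def logits(W1, b1, W2, b2, x, act):
    # Logits of a single-hidden-layer MLP with activation `act`
    h = [act(z) for z in preacts(W1, b1, x)]
    return [sum(w * hj for w, hj in zip(row, h)) + bk for row, bk in zip(W2, b2)]

def lifted(W1, b1, W2, x):
    # Per class k: (sum_j W2[k][j] z_j^2, sum_j W2[k][j] z_j, sum_j W2[k][j])
    z = preacts(W1, b1, x)
    return [
        (sum(w * zj * zj for w, zj in zip(row, z)),
         sum(w * zj for w, zj in zip(row, z)),
         sum(row))
        for row in W2
    ]

def quad_logits(W1, b1, W2, b2, x, coeffs):
    # Logits under p(z) = a z^2 + b z + c, written as a linear form in (a, b, c)
    a, b, c = coeffs
    return [a * f2 + b * f1 + c * f0 + bk
            for (f2, f1, f0), bk in zip(lifted(W1, b1, W2, x), b2)]

def argmax(v):
    return max(range(len(v)), key=v.__getitem__)

def agreement(W1, b1, W2, b2, X, coeffs):
    # Fraction of calibration samples whose predicted class is unchanged
    hits = sum(
        argmax(logits(W1, b1, W2, b2, x, relu))
        == argmax(quad_logits(W1, b1, W2, b2, x, coeffs))
        for x in X
    )
    return hits / len(X)

# Hypothetical toy network (2 inputs, 3 hidden units, 2 classes) and calibration set
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.5, 0.5]]
b1 = [0.0, -0.5, 0.25]
W2 = [[1.0, -0.5, 0.25], [-1.0, 0.75, 0.5]]
b2 = [0.1, -0.1]
X = [[0.5, -0.2], [-1.0, 0.3], [0.2, 0.8], [1.5, 1.0]]
COEFFS = (0.25, 0.5, 0.25)  # arbitrary illustrative coefficients, not fitted

print("decision agreement:", agreement(W1, b1, W2, b2, X, COEFFS))
```

For a sample whose ReLU decision is class y, the decision is preserved exactly when, for every other class k, a*(F2_y - F2_k) + b*(F1_y - F1_k) + c*(F0_y - F0_k) + (b2_y - b2_k) > 0, where (F2, F1, F0) are the lifted features above. Finding coefficients that satisfy all such inequalities is a linear separation problem; using separate coefficients per hidden unit would keep the problem linear but with more unknowns.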

Architecture Design (Transformers, SSMs, MoE) | Inference & Quantization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
