DUT | Lancaster University | SEU | Apr 12, 2026 | arXiv:2604.10494

From Characterization to Microarchitecture: Designing an Elegant and Reliable BFP-Based NPU

Jie Zhang, Jiapeng Guan, Hao Zhou, Xiaomeng Han, Tinglue Wang, Ran Wei, Zhe Jiang

AI Summary

This paper presents a reliability analysis of Block Floating-Point (BFP)-based Neural Processing Units (NPUs) using RTL-level fault injection. The analysis uncovers heterogeneous vulnerabilities across bits and paths and shows that standard end-to-end checks become ineffective under nonlinear block scaling. Informed by these findings, the authors propose a fault-tolerant NPU microarchitecture that decouples mantissa and exponent computations and protects each with lightweight mechanisms. The resulting design achieves reliability close to dual modular redundancy (DMR) with a 3.55% geometric mean performance overhead and less than 2% hardware cost.

Key Contribution

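BFP-based NPUs, despite their efficiency, show pronounced and heterogeneous vulnerability to hardware faults. A microarchitecture that decouples the mantissa and exponent paths and protects each separately mitigates this at a 3.55% geometric mean performance overhead and less than 2% hardware cost.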

Abstract

Block Floating-Point (BFP) is emerging as an attractive data format for edge Neural Processing Units (NPUs), combining wide dynamic range with high hardware efficiency. However, its behavior under hardware faults and its suitability for safety-critical deployments remain underexplored. Here, we present the first in-depth empirical reliability study of BFP-based NPUs. Using RTL-level fault injection on NPUs, our bit- and path-level analysis reveals pronounced heterogeneous vulnerabilities and shows that conventional end-to-end checks become ineffective under nonlinear block scaling. Guided by these insights, we design a fault-tolerant BFP-based NPU microarchitecture that aligns the BFP computational semantics with reliability constraints. The design uses a row/column-wise blocking strategy to decouple the fixed-point mantissa computations from the scalar exponent path, and introduces ultra-lightweight protection mechanisms for each. Experimental results demonstrate that our design achieves near-dual-modular-redundancy reliability with only $3.55\%$ geometric mean performance overhead and less than $2\%$ hardware cost.
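To make the mantissa/exponent split concrete, here is a minimal Python sketch of a generic BFP encoding: one shared power-of-two exponent per block and signed integer mantissas. It is an illustration under stated assumptions, not the paper's RTL, blocking strategy, or protection scheme. The function names (`bfp_quantize`, `bfp_dequantize`, `bfp_dot`) and the 8-bit mantissa width are hypothetical choices made for this example.

```python
import math

def bfp_quantize(block, mantissa_bits=8):
    """Encode a list of floats as (shared_exponent, integer mantissas).

    Assumed encoding: value_i ~= mantissa_i * 2**(shared_exp - (mantissa_bits - 1)).
    """
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1, so e bounds the block.
    _, shared_exp = math.frexp(max_abs)
    # One bit holds the sign, leaving mantissa_bits - 1 bits of magnitude.
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    lo, hi = -(1 << (mantissa_bits - 1)), (1 << (mantissa_bits - 1)) - 1
    mantissas = [min(hi, max(lo, round(x / scale))) for x in block]
    return shared_exp, mantissas

def bfp_dequantize(shared_exp, mantissas, mantissa_bits=8):
    """Decode a BFP block back to floats."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]

def bfp_dot(a, b, mantissa_bits=8):
    """Dot product of two BFP blocks with separate mantissa and exponent paths."""
    exp_a, man_a = a
    exp_b, man_b = b
    acc = sum(x * y for x, y in zip(man_a, man_b))   # fixed-point mantissa path
    exp = exp_a + exp_b - 2 * (mantissa_bits - 1)    # scalar exponent path
    return math.ldexp(acc, exp)                      # acc * 2**exp

# Example blocks chosen so that quantization is exact.
a = bfp_quantize([0.5, -0.25, 0.125, 0.75])   # (0, [64, -32, 16, 96])
b = bfp_quantize([1.0, 2.0, -1.0, 0.5])       # (2, [32, 64, -32, 16])
print(bfp_dot(a, b))                           # 0.25

# Single-bit faults, injected in software purely for illustration.
exp, man = a
print(bfp_dequantize(exp ^ 1, man))                   # exponent fault: every element doubles
print(bfp_dequantize(exp, [man[0] ^ 1] + man[1:]))    # mantissa fault: only element 0 changes
```

In this toy model a flipped exponent bit rescales the entire block, while a flipped low-order mantissa bit perturbs a single element slightly. That asymmetry is one simple way to picture why faults on different bits and paths can differ widely in impact, and why the integer accumulation and the scalar exponent addition can be treated as separate paths.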

Architecture Design (Transformers, SSMs, MoE) | Distributed Systems & Hardware | Inference & Quantization

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
