The paper introduces SHIELD8-UAV, a hardware accelerator for real-time UAV acoustic detection built around a precision-aware 1D feature-driven CNN (1D-F-CNN). To keep latency and power low, the design executes layers sequentially in 8-bit precision on a shared multi-precision datapath, and combines this with layer-sensitivity quantization and structured channel pruning. On a Pynq-Z2 FPGA the accelerator reports 116 ms end-to-end latency at 0.94 W, and ASIC synthesis in UMC 40 nm reports a 1.56 GHz maximum operating frequency, while 8-bit accuracy stays within 2.5% of the 89.91% FP32 baseline.
Achieve near-FP32 accuracy in UAV acoustic detection with a sequential 8-bit CNN accelerator that cuts latency by 37.8% and 49.6% relative to QuantMAC and LPRE, respectively, and uses 5-9% less logic than parallel designs.
Real-time unmanned aerial vehicle (UAV) acoustic detection at the edge demands low-latency inference under strict power and hardware limits. This paper presents SHIELD8-UAV, a sequential 8-bit hardware implementation of a precision-aware 1D feature-driven CNN (1D-F-CNN) accelerator for continuous acoustic monitoring. The design performs layer-wise execution on a shared multi-precision datapath, eliminating the need for replicated processing elements. A layer-sensitivity quantisation framework supports FP32, BF16, INT8, and FXP8 modes, while structured channel pruning reduces the flattened feature dimension from 35,072 to 8,704 (a 75% reduction), thereby lowering serialised dense-layer cycles. The model achieves 89.91% detection accuracy in FP32 with less than 2.5% degradation in 8-bit modes. The accelerator uses 2,268 LUTs and 0.94 W power with 116 ms end-to-end latency, achieving 37.8% and 49.6% latency reduction compared with QuantMAC and LPRE, respectively, on a Pynq-Z2 FPGA, and 5-9% lower logic usage than parallel designs. ASIC synthesis in UMC 40 nm technology shows a maximum operating frequency of 1.56 GHz, 3.29 mm² core area, and 1.65 W total power. These results demonstrate that sequential execution combined with precision-aware quantisation and serialisation-aware pruning enables practical low-energy edge inference without relying on massive parallelism.
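The link between pruning and serialised dense-layer cycles can be checked with a short calculation. The sketch below is illustrative only: the feature dimensions 35,072 and 8,704 come from the abstract, but the dense-layer width `HIDDEN_UNITS`, the one-MAC-per-cycle assumption, and the helper names are hypothetical and are not taken from the paper.

```python
def pruning_reduction(before: int, after: int) -> float:
    """Fraction of flattened features removed by pruning."""
    return 1.0 - after / before


def serial_dense_cycles(in_features: int, out_features: int,
                        macs_per_cycle: int = 1) -> int:
    """Cycles to compute a dense layer on a datapath that retires
    `macs_per_cycle` multiply-accumulates per cycle (ceiling division)."""
    total_macs = in_features * out_features
    return -(-total_macs // macs_per_cycle)


FLAT_BEFORE = 35_072   # flattened feature dimension before pruning (from the abstract)
FLAT_AFTER = 8_704     # flattened feature dimension after pruning (from the abstract)
HIDDEN_UNITS = 64      # hypothetical dense-layer width, chosen only for illustration

reduction = pruning_reduction(FLAT_BEFORE, FLAT_AFTER)
cycles_before = serial_dense_cycles(FLAT_BEFORE, HIDDEN_UNITS)
cycles_after = serial_dense_cycles(FLAT_AFTER, HIDDEN_UNITS)

print(f"feature reduction: {reduction:.1%}")
print(f"dense-layer cycles: {cycles_before:,} -> {cycles_after:,}")
print(f"cycle ratio: {cycles_before / cycles_after:.2f}x")
```

Under these assumptions the first dense layer's cycle count scales linearly with the flattened input dimension, so the roughly 75% feature reduction translates into about a 4x cut in that layer's serialised cycles regardless of the width chosen for `HIDDEN_UNITS`.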