This paper presents an FPGA implementation of a binarized YOLOv3-tiny-like object detector that quantizes weights to 1 bit and activations to 8 bits in most convolutional layers. To avoid a separate rescaling step, the design fuses channel-wise compensation directly into the BNN processing element. The resulting model achieves 39.6% mAP50 on VOC while requiring only 0.098 GFLOPs, demonstrating that efficient object detection is feasible on low-cost FPGAs.
A binarized YOLOv3-tiny-like detector implemented on a low-cost FPGA produces raw detection outputs nearly identical to those of the ONNX model (correlation coefficient 0.999964 in RTL simulation) at a computational cost of 0.098 GFLOPs.
This paper implements a Binary Neural Network (BNN) based YOLOv3-tiny-like object detector on a low-cost FPGA. The network takes 320 × 320 × 3 RGB images as input. Its main convolution layers use 1-bit weights and 8-bit activations, while Conv1 and the final detection head use fixed-point standard convolutions. Weights, biases, and quantization parameters are extracted from the trained ONNX model, converted to fixed point, packed into COE files, and stored in Vivado BRAM ROMs. The hardware is written entirely in Verilog RTL and includes padding, line buffering, binary convolution, quantization post-processing, max pooling, and detection-head computation. For layers where Mul_prev is indexed by input channel and Div_current by output channel, Mul_prev is fused into the BNN PE so that channel-wise compensation is applied during accumulation. On VOC, the model obtains 39.6% mAP50 with 0.098 GFLOPs and 0.74 M parameters. RTL simulation shows that the final raw detection output reaches a correlation coefficient of 0.999964 and a mean absolute error of 0.020027 against the corresponding ONNX node.
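The fusion of Mul_prev into the PE can be illustrated with a small software model. The sketch below is not the paper's RTL; it is a minimal Python illustration under several assumptions: activations are unsigned 8-bit integers, a weight bit of 1 encodes +1 and 0 encodes -1, Mul_prev is an integer multiplier per input channel, Div_current is an integer divisor per output channel, and requantization uses floor division followed by clamping. The function names `pe_unfused`, `pe_fused`, and `requantize` are hypothetical. Because Mul_prev varies with the input channel, it cannot be factored out of the sum over channels, so it has to be applied to each term before accumulation; Div_current depends only on the output channel and can be applied once afterwards.

```python
def pe_unfused(acts, weight_bits, mul_prev):
    """Reference path: rescale every input channel first, then accumulate."""
    scaled = [a * m for a, m in zip(acts, mul_prev)]  # separate compensation pass
    return sum(s if bit else -s for s, bit in zip(scaled, weight_bits))


def pe_fused(acts, weight_bits, mul_prev):
    """Fused path: apply the per-input-channel multiplier inside the accumulation loop."""
    acc = 0
    for a, bit, m in zip(acts, weight_bits, mul_prev):
        term = a * m                      # channel-wise compensation (assumed integer)
        acc += term if bit else -term     # 1-bit weight selects add or subtract
    return acc


def requantize(acc, div_current):
    """Assumed post-processing: floor-divide by the per-output-channel divisor,
    then clamp to the unsigned 8-bit range."""
    return max(0, min(255, acc // div_current))


# Hypothetical values for one output channel and four input channels.
acts = [10, 20, 3, 77]          # 8-bit activations
weight_bits = [1, 0, 1, 1]      # 1 -> +1, 0 -> -1
mul_prev = [3, 1, 5, 2]         # indexed by input channel
div_current = 4                 # divisor for this output channel

acc = pe_fused(acts, weight_bits, mul_prev)
out = requantize(acc, div_current)
```

With integer arithmetic the fused and unfused paths give the same accumulator value, so in this model fusion changes only where the multiplication happens, not the result; the fused form avoids a separate pass over the activations.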