Abstract—Deploying deep learning-based object detection models such as YOLOv4 on resource-constrained embedded architectures presents several challenges, particularly regarding computing performance, memory usage, and energy consumption. This study examines quantization of the YOLOv4 model to enable real-time inference on lightweight edge devices, focusing on NVIDIA's Jetson Nano and AGX. We apply post-training quantization techniques to reduce both model size and computational complexity while maintaining acceptable detection accuracy. Experimental results indicate that an 8-bit quantized YOLOv4 model achieves near real-time performance with minimal accuracy loss, making it well suited for embedded applications such as autonomous navigation. The study also highlights the trade-offs between model compression and detection performance, and proposes an optimization method tailored to the hardware constraints of embedded architectures.
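The core idea behind 8-bit post-training quantization can be illustrated in its simplest per-tensor symmetric form: map float32 weights onto the int8 range via a single scale factor, cutting storage by 4x. This is a generic sketch for illustration only, not the paper's actual Jetson deployment pipeline (which would typically involve calibration over activation statistics as well):

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: one scale maps max |w| to 127.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Reconstruct an approximation of the original float weights.
    return q.astype(np.float32) * scale

# Hypothetical conv kernel standing in for a YOLOv4 layer's weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(3, 3, 64, 64)).astype(np.float32)

q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# int8 storage is 4x smaller; round-off error is bounded by scale / 2.
max_err = float(np.abs(w - w_hat).max())
```

In practice the accuracy/size trade-off the abstract describes comes from applying this idea per layer (often per channel), with activation ranges estimated from a calibration dataset rather than from the weights alone.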