Abstract—Deploying deep learning-based object detection models such as YOLOv4 on resource-constrained embedded architectures presents several challenges, particularly regarding computing performance, memory usage, and energy consumption. This study examines quantization of the YOLOv4 model to enable real-time inference on lightweight edge devices, focusing on NVIDIA's Jetson Nano and AGX. We apply post-training quantization techniques to reduce both model size and computational complexity while maintaining acceptable detection accuracy. Experimental results indicate that an 8-bit quantized YOLOv4 model achieves near real-time performance with minimal accuracy loss, making it well suited for embedded applications such as autonomous navigation. The study also highlights the trade-offs between model compression and detection performance, and proposes an optimization method tailored to the hardware constraints of embedded architectures.
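The core idea behind 8-bit post-training quantization can be illustrated in its simplest per-tensor symmetric form: map float32 weights onto the int8 range via a single scale factor, cutting storage by 4x. This is a generic sketch for illustration only, not the paper's actual Jetson deployment pipeline (which would typically involve calibration over activation statistics as well):

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: one scale maps max |w| to 127.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Reconstruct an approximation of the original float weights.
    return q.astype(np.float32) * scale

# Hypothetical conv kernel standing in for a YOLOv4 layer's weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(3, 3, 64, 64)).astype(np.float32)

q, s = quantize_int8(w)
w_hat = dequantize(q, s)

# int8 storage is 4x smaller; round-off error is bounded by scale / 2.
max_err = float(np.abs(w - w_hat).max())
```

In practice the accuracy/size trade-off the abstract describes comes from applying this idea per layer (often per channel), with activation ranges estimated from a calibration dataset rather than from the weights alone.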