Feb 15, 2026 · arXiv:2602.14040

Explainability-Inspired Layer-Wise Pruning of Deep Neural Networks for Efficient Object Detection

AI Summary

The paper introduces an explainability-inspired, layer-wise pruning framework for object detection networks. It addresses a limitation of magnitude-based pruning: weight magnitude does not necessarily reflect a component's functional contribution to the task. The authors use a SHAP-inspired gradient-activation attribution to estimate layer importance, so that pruning decisions are driven by data rather than by static weight magnitudes. Experiments across ResNet-50, MobileNetV2, ShuffleNetV2, Faster R-CNN, RetinaNet, and YOLOv8 on the COCO 2017 validation set show better accuracy-efficiency trade-offs than L1-norm pruning, particularly for ShuffleNetV2 and RetinaNet.

Key Contribution

L1-norm pruning leaves performance on the table: SHAP-inspired layer-wise pruning yields up to a 10% inference speedup (ShuffleNetV2) and preserves the baseline mAP on RetinaNet, where L1-norm pruning incurs a 1.3% mAP drop.

Abstract

Deep neural networks (DNNs) have achieved remarkable success in object detection tasks, but their increasing complexity poses significant challenges for deployment on resource-constrained platforms. While model compression techniques such as pruning have emerged as essential tools, traditional magnitude-based pruning methods do not necessarily align with the true functional contribution of network components to task-specific performance. In this work, we present an explainability-inspired, layer-wise pruning framework tailored for efficient object detection. Our approach leverages a SHAP-inspired gradient-activation attribution to estimate layer importance, providing a data-driven proxy for functional contribution rather than relying solely on static weight magnitudes. We conduct comprehensive experiments across diverse object detection architectures, including ResNet-50, MobileNetV2, ShuffleNetV2, Faster R-CNN, RetinaNet, and YOLOv8, evaluating performance on the Microsoft COCO 2017 validation set. The results show that the proposed attribution-inspired pruning consistently identifies different layers as least important compared to L1-norm-based methods, leading to improved accuracy-efficiency trade-offs. Notably, for ShuffleNetV2, our method yields a 10% empirical increase in inference speed, whereas L1-pruning degrades performance by 13.7%. For RetinaNet, the proposed approach preserves the baseline mAP (0.151) with negligible impact on inference speed, while L1-pruning incurs a 1.3% mAP drop for a 6.2% speed increase. These findings highlight the importance of data-driven layer importance assessment and demonstrate that explainability-inspired compression offers a principled direction for deploying deep neural networks on edge and resource-constrained platforms while preserving both performance and interpretability.
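
To make the contrast between the two importance criteria concrete, here is a minimal sketch in pure Python. It is not the authors' implementation: the abstract does not give the exact attribution formula, so the score used here (mean absolute value of activation times gradient, averaged over samples) is an assumption, as are the toy three-layer ReLU network, the squared-error loss, and every name in the code (`LAYERS`, `DATA`, `grad_activation_scores`, `l1_scores`, `least_important`).

```python
# Toy comparison of two layer-importance scores on a tiny ReLU MLP.
# Assumption: "gradient-activation attribution" is approximated here by
# mean |activation * dLoss/dActivation| per layer; the paper may differ.

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(layers, x):
    """Return the post-ReLU activation of every layer for input x."""
    acts = []
    for W, b in layers:
        x = [max(0.0, z + bi) for z, bi in zip(matvec(W, x), b)]
        acts.append(x)
    return acts

def activation_grads(layers, acts, target):
    """Backpropagate L = 0.5 * sum((out - target)^2) to each layer's output."""
    grads = [None] * len(layers)
    grads[-1] = [o - t for o, t in zip(acts[-1], target)]
    for l in range(len(layers) - 1, 0, -1):
        # ReLU passes gradient only where the unit was active.
        dz = [g if a > 0 else 0.0 for g, a in zip(grads[l], acts[l])]
        W = layers[l][0]
        # dL/da_{l-1} = W^T dz
        grads[l - 1] = [sum(W[i][j] * dz[i] for i in range(len(W)))
                        for j in range(len(W[0]))]
    return grads

def grad_activation_scores(layers, data):
    """Data-driven score: mean |a * dL/da| per layer, averaged over samples."""
    scores = [0.0] * len(layers)
    for x, target in data:
        acts = forward(layers, x)
        grads = activation_grads(layers, acts, target)
        for l, (a, g) in enumerate(zip(acts, grads)):
            scores[l] += sum(abs(ai * gi) for ai, gi in zip(a, g)) / len(a)
    return [s / len(data) for s in scores]

def l1_scores(layers):
    """Static score: mean absolute weight per layer (biases ignored)."""
    return [sum(abs(w) for row in W for w in row) / (len(W) * len(W[0]))
            for W, _ in layers]

def least_important(scores):
    """Index of the layer a pruner would remove first under this score."""
    return min(range(len(scores)), key=scores.__getitem__)

# Hypothetical three-layer network: (weights, biases) per layer.
LAYERS = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[2.0, -2.0], [-2.0, 2.0]], [0.1, 0.1]),
    ([[0.2, 0.2], [0.2, 0.2]], [1.0, 1.0]),
]
# Hypothetical calibration samples: (input, target) pairs.
DATA = [([1.0, 1.0], [0.0, 0.0]), ([1.0, 0.0], [0.0, 0.0])]

if __name__ == "__main__":
    ga = grad_activation_scores(LAYERS, DATA)
    l1 = l1_scores(LAYERS)
    print("gradient-activation scores:", ga, "-> prune layer", least_important(ga))
    print("L1-norm scores:            ", l1, "-> prune layer", least_important(l1))
```

In this toy setup the two criteria disagree: the L1-norm score singles out the last layer because its weights are small, while the gradient-activation score singles out the first layer because its activations contribute least to the loss on the sample data. This mirrors, in miniature, the abstract's observation that the two approaches identify different layers as least important. Real detectors would require per-layer hooks in a deep learning framework and a detection loss, which this sketch does not attempt.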

Computer Vision · Inference & Quantization · Interpretability & Mechanistic Interp

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
