This paper develops a lightweight udder segmentation model for on-device deployment on farms, enabling continuous, non-invasive monitoring of udder health from thermal imagery. The authors explore compact binary segmentation models built on DINOv3-initialised encoders (ConvNeXt and ViT) with lightweight decoders, applying isomorphic pruning and post-training quantisation to reduce model size and computational cost. The resulting pruned ConvNeXtsmall+FPN model achieves 81.68% IoU with 25.44M parameters and 32.60B MACs, running at 81.6 ms per frame on an Nvidia Jetson Orin Nano.
You can now run accurate thermal udder segmentation for livestock health monitoring on a low-power Nvidia Jetson Orin Nano at 81.6 ms per frame.
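For intuition, the reported 81.6 ms mean latency translates into a sustained throughput of roughly 12 frames per second, comfortably above what a monitoring loop at a milking station needs:

```python
# Back-of-envelope throughput from the reported mean per-frame latency
# on the Jetson Orin Nano (15 W power profile).
latency_ms = 81.6          # mean latency per frame, from the paper
fps = 1000.0 / latency_ms  # sustained frames per second
print(f"{fps:.1f} FPS")    # ~12.3 FPS
```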
Accurate and automatic extraction of the udder skin surface temperature (USST) from thermal imagery enables continuous, non-invasive monitoring of udder health. The literature indicates that USST correlates with somatic cell count (SCC), the primary marker of mastitis activity, provided that udder regions are precisely segmented. Building on our previous system deployed on-farm, which integrated thermal imaging with robotic milking for real-time monitoring, we focus this paper on the segmentation component to obtain a precise yet lightweight, farm-deployable model. We explore compact binary segmentation models that pair the newly released DINOv3-initialised encoders with lightweight decoders: ConvNeXt + Feature Pyramid Network (FPN) and ViT + Dense Prediction Transformer (DPT). Our training follows four stages: (i) train the decoder while keeping the encoder frozen; (ii) apply isomorphic pruning to remove redundant, shape-consistent channels and feature dimensions while preserving tensor interfaces; (iii) unfreeze the pruned encoder and fine-tune the whole network to recover performance; and (iv) apply post-training quantisation to produce FP16 variants. We evaluate encoders across scales and report test intersection over union (IoU), parameter count, and multiply–accumulate operations (MACs), selecting via the IoU-cost frontier at full precision. While a non-pruned ConvNeXttiny backbone attains the highest IoU, a pruned ConvNeXtsmall achieves comparable IoU at a lower cost and is therefore our Pareto-optimal choice for edge deployment. Such a model (ConvNeXtsmall+FPN) achieves an 81.68% IoU with 25.44M parameters and 32.60B MACs. It is exported to ONNX, quantised, and deployed on an Nvidia Jetson Orin Nano as a practical, offline, and privacy-preserving edge solution for farms. The deployed FP16-quantised ONNX model with CUDA-accelerated runtime achieves an average latency of 81.6 ms per frame at the 15W power profile on the Jetson Orin Nano.
The dataset, source code, training logs, model weights, and Jetson-ready Docker image used for this study are released open-source to ensure full reproducibility and a fair comparison.