COTONET, a custom YOLO11 model, was developed to detect cotton capsules at various growth stages for automated harvesting. The model incorporates Squeeze-and-Excitation blocks, CARAFE upsampling, SimAM, and PHAM attention to enhance feature extraction and improve detection of difficult instances. COTONET achieves a mAP50 of 81.1% and a mAP50-95 of 60.6%, outperforming standard YOLO baselines while maintaining a small model size suitable for edge computing.
By adding attention mechanisms and content-aware upsampling to a YOLO11 architecture, COTONET outperforms standard YOLO baselines at cotton boll detection, a step towards more precise automated harvesting.
Cotton harvesting is a critical phase in which cotton capsules are physically manipulated, which can lead to fibre degradation. To maintain the highest quality, harvesting methods must emulate delicate manual grasping to preserve cotton's intrinsic properties. Automating this process requires systems capable of recognising cotton capsules across various phenological stages. To address this challenge, we propose COTONET, an enhanced custom YOLO11 model tailored with attention mechanisms to improve the detection of difficult instances. The architecture incorporates gradients in non-learnable operations to enhance shape and feature extraction. Key architectural modifications include: the replacement of convolutional blocks with Squeeze-and-Excitation blocks, a redesigned backbone integrating attention mechanisms, and the replacement of standard upsampling operations with Content-Aware ReAssembly of FEatures (CARAFE). Additionally, we integrate Simple Attention Modules (SimAM) for primary feature aggregation and Parallel Hybrid Attention Mechanisms (PHAM) for channel-wise, spatial-wise and coordinate-wise attention in the downward neck path. This configuration offers increased flexibility and robustness for interpreting the complexity of cotton crop growth. With 7.6M parameters and 27.8 GFLOPs, COTONET is comparable in size to small-to-medium YOLO models, making it suitable for low-resource edge computing and mobile robotics. COTONET outperforms the standard YOLO baselines, achieving a mAP50 of 81.1% and a mAP50-95 of 60.6%.
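As an illustration of what a parameter-free attention operation such as SimAM can look like, here is a minimal pure-Python sketch for a single feature-map channel. It follows the generally published SimAM formulation and is not taken from COTONET's code. The function names `simam_weights` and `simam_channel` are hypothetical, and the regulariser `e_lambda=1e-4` is an assumed default.

```python
import math


def simam_weights(channel, e_lambda=1e-4):
    """Return SimAM-style attention gates for one H x W channel.

    `channel` is a list of equal-length rows of floats. The gate has no
    learnable parameters: it depends only on how far each activation lies
    from the channel mean, relative to the channel variance.
    `e_lambda` is an assumed regularisation constant.
    """
    values = [v for row in channel for v in row]
    if len(values) < 2:
        raise ValueError("channel needs at least two activations")
    n = len(values) - 1  # number of other activations in the channel
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / n
    gates = []
    for row in channel:
        gate_row = []
        for v in row:
            # Inverse energy: larger for activations that stand out.
            inv_energy = (v - mean) ** 2 / (4.0 * (variance + e_lambda)) + 0.5
            gate_row.append(1.0 / (1.0 + math.exp(-inv_energy)))  # sigmoid
        gates.append(gate_row)
    return gates


def simam_channel(channel, e_lambda=1e-4):
    """Scale each activation by its SimAM-style gate."""
    gates = simam_weights(channel, e_lambda)
    return [
        [v * g for v, g in zip(row, gate_row)]
        for row, gate_row in zip(channel, gates)
    ]
```

In this sketch, an activation that deviates strongly from its channel mean receives a gate closer to 1 than its neighbours, so distinctive responses are emphasised without adding any trainable weights.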