Detecting insect pests on sticky traps based on YOLOv11n and image slice inference
-
Graphical Abstract
-
Abstract
Insect pests such as thrips and whiteflies pose a serious threat to crop yield and quality in greenhouse environments, making timely and accurate pest detection essential. Yellow sticky traps are widely used to monitor pest populations, but manual inspection cannot achieve the accuracy needed to prevent crop damage, owing to the small size and dense distribution of these pests. In this study, an improved pest detection method, the YOLOv11n-LMN model combined with the slicing aided hyper inference (SAHI) framework, was proposed. It effectively detects small, densely distributed pests while maintaining high model efficiency, which is particularly important for deployment on edge devices: small-target detection accuracy was enhanced while model size and computational complexity were significantly reduced. Specifically, the original backbone network was replaced with PP-LCNet, a lightweight CPU-oriented convolutional neural network that employs depthwise separable convolutions and the H-Swish activation function and integrates the squeeze-and-excitation (SE) attention mechanism, improving feature representation while reducing the number of parameters. In addition, a multi-scale similarity-aware module (MulSimAM) was introduced to extract multi-scale features and adaptively allocate attention weights across scales; it captures fine-grained features of small targets and exploits cross-scale feature correlations, thereby improving detection accuracy and robustness. Furthermore, the normalized Wasserstein distance (NWD) loss function was incorporated to improve localization accuracy for small objects: bounding boxes are modeled as Gaussian distributions rather than compared with conventional IoU-based metrics, such as the complete IoU (CIoU), which is better suited to dense scenarios. Finally, large images were divided into overlapping slices using the SAHI strategy.
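The NWD similarity mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the (cx, cy, w, h) box format and the normalizing constant C are assumptions for the sketch, not values taken from this study.

```python
import math

def box_to_gaussian(box):
    """Model an axis-aligned box (cx, cy, w, h) as a 2-D Gaussian
    N(m, S) with mean m = (cx, cy) and covariance S = diag(w^2/4, h^2/4).
    Only the vector (cx, cy, w/2, h/2) is needed for the distance below."""
    cx, cy, w, h = box
    return (cx, cy, w / 2.0, h / 2.0)

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two boxes.

    For these diagonal Gaussians, the squared 2-Wasserstein distance
    reduces to the squared Euclidean distance between the
    (cx, cy, w/2, h/2) vectors; the similarity is then
    NWD = exp(-sqrt(W2) / C), where C is a dataset-dependent constant
    (the value here is an assumption for illustration)."""
    ga, gb = box_to_gaussian(box_a), box_to_gaussian(box_b)
    w2 = sum((a - b) ** 2 for a, b in zip(ga, gb))
    return math.exp(-math.sqrt(w2) / c)
```

In training, this similarity is typically turned into a regression loss as 1 − NWD; unlike IoU, it stays smooth and informative even when small boxes barely overlap.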
This preserves local details and enables accurate, fast computation on high-resolution sticky trap images. Slice-level detections were merged and then refined using non-maximum suppression (NMS), resulting in a substantial improvement in small-object detection. Experimental results demonstrate that the YOLOv11n-LMN model significantly outperformed the baseline YOLOv11n in precision, recall, and mean average precision (mAP): detection accuracy was improved by 3.1 and 4.9 percentage points for thrips and whiteflies, respectively, while the model size was reduced by 4.7 MB. On full armyworm sticky trap images, the YOLOv11n-LMN+SAHI model improved mAP by 40.4 percentage points over the original YOLOv11n, which obtained a mAP of 49.2%. On the yellow sticky traps (YST) dataset, the model achieved a mAP@50 of 93.9%, 2 percentage points higher than the baseline model. Finally, both the baseline and improved models were deployed on a Raspberry Pi platform, where the improved model achieved an average inference time of 9.6 s per sticky trap image, demonstrating strong practical applicability. These findings provide an efficient and deployable solution for intelligent pest monitoring in greenhouse agriculture and offer valuable technical support for precision farming applications.
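The slice-then-merge pipeline can be sketched as below. This is a minimal sketch of SAHI-style sliced inference, not the study's implementation; the tile size, overlap ratio, and the `detect_tile` callback are hypothetical placeholders.

```python
def make_slices(img_w, img_h, tile=640, overlap=0.2):
    """Top-left corners of overlapping tiles covering the image."""
    step = max(1, int(tile * (1 - overlap)))
    xs = list(range(0, max(img_w - tile, 0) + 1, step))
    ys = list(range(0, max(img_h - tile, 0) + 1, step))
    # Make sure the right and bottom edges are covered.
    if xs[-1] + tile < img_w:
        xs.append(img_w - tile)
    if ys[-1] + tile < img_h:
        ys.append(img_h - tile)
    return [(x, y) for y in ys for x in xs]

def iou(a, b):
    """IoU of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def nms(dets, thr=0.5):
    """Greedy NMS over (box, score) detections."""
    dets = sorted(dets, key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in dets:
        if all(iou(box, k[0]) < thr for k in kept):
            kept.append((box, score))
    return kept

def sliced_inference(img_w, img_h, detect_tile, tile=640, overlap=0.2):
    """Run a per-tile detector over overlapping slices, shift each box
    back to full-image coordinates, and merge duplicates with NMS.
    `detect_tile(x, y, w, h)` is a hypothetical callback returning
    [((x0, y0, x1, y1), score), ...] in tile-local coordinates."""
    all_dets = []
    for x, y in make_slices(img_w, img_h, tile, overlap):
        for (x0, y0, x1, y1), s in detect_tile(x, y, tile, tile):
            all_dets.append(((x0 + x, y0 + y, x1 + x, y1 + y), s))
    return nms(all_dets)
```

Because neighboring tiles overlap, the same pest is often detected in several slices; the final NMS pass is what collapses those duplicates into a single full-image detection.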