Abstract
Accurate and rapid detection is often required to monitor and control field crop pests. However, the small size and dense distribution of field pests lead to high miss rates and insufficient accuracy. In this study, a small object detection model for crop pest images, named FCDM-YOLOv8, was proposed based on an improved YOLOv8. Firstly, the original C2f module in the backbone network was replaced with a lightweight C2f-FE module to reduce the computational burden of the model. Additionally, depthwise separable convolution (DWConv) was introduced to replace the ordinary convolutions in both the backbone and neck networks, reducing the number of parameters while improving detection performance and operational efficiency. Secondly, a context aggregation network (CONTAINER) was incorporated into the neck network to strengthen contextual information and refine the feature representations, so that dense pest groups were better captured and recognized and detection accuracy was improved. Thirdly, the model structure was adjusted by removing the P5 layer and the large object detection head in the backbone network and then adding a small object detection layer, so that more feature information related to small objects was retained for detecting small pests. Fourthly, the decoupled head in YOLOv8 was replaced with a dynamic detection head (dynamic head, Dyhead), which adaptively adjusts its detection strategy according to the density of object regions and focuses on dense and small objects to extract more useful feature information. Finally, Focaler-MPDIoU was selected as the bounding box loss function to improve detection accuracy and robustness on small objects and difficult examples. Experiments were carried out to validate the improved model.
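The parameter saving from replacing ordinary convolutions with depthwise separable ones can be illustrated with a short calculation. The sketch below assumes the textbook formulation (a k x k depthwise convolution followed by a 1 x 1 pointwise convolution, biases omitted); whether the DWConv layers in FCDM-YOLOv8 follow exactly this layout is an assumption, and the layer sizes are hypothetical.

```python
def conv_params(c_in, c_out, k):
    """Weight count of an ordinary k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def dw_separable_params(c_in, c_out, k):
    """Weight count of a k x k depthwise conv plus a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 128, 256, 3
std = conv_params(c_in, c_out, k)
dws = dw_separable_params(c_in, c_out, k)
print(f"ordinary: {std}, depthwise separable: {dws}, ratio: {dws / std:.3f}")
```

Under these assumptions the ratio equals 1/c_out + 1/k^2, so for 3 x 3 kernels the separable layer needs roughly one ninth of the weights of the ordinary layer.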
The results show that the FCDM-YOLOv8 model achieved a precision, recall, mAP0.5, and mAP0.5~0.95 of 81.4%, 73.5%, 80.1%, and 41.1%, respectively, on a self-collected and self-constructed dataset of pests in field environments. Compared with the baseline YOLOv8n, the FCDM-YOLOv8 model improved precision by 2.0 percentage points, recall by 5.2 percentage points, mAP0.5 by 5.1 percentage points, and mAP0.5~0.95 by 2.8 percentage points, while the model size was reduced by 38.1%. Compared with mainstream object detection models (such as Faster R-CNN, SSD, and other YOLO series models), the FCDM-YOLOv8 model achieved the highest recall and mAP values with the lowest memory footprint. Visual comparisons with the baseline model also showed that the FCDM-YOLOv8 model markedly improved detection accuracy and reduced the miss rate. Furthermore, generalization experiments were conducted on the public dataset VisDrone2019, where the precision, recall, mAP0.5, and mAP0.5~0.95 of the FCDM-YOLOv8 model reached 52.6%, 38.9%, 41.1%, and 24.3%, respectively, which were 7.7, 5.3, 7.7, and 4.9 percentage points higher than those of the baseline YOLOv8n. On the COCO2017-small dataset, the precision, recall, mAP0.5, and mAP0.5~0.95 of the FCDM-YOLOv8 model reached 44.8%, 29.0%, 28.3%, and 16.0%, respectively, which were 1.9, 1.6, 2.2, and 1.5 percentage points higher than those of the baseline YOLOv8n. The FCDM-YOLOv8 model thus showed strong generalization and detection accuracy. Finally, a small target detection system for crop pests was developed based on the FCDM-YOLOv8 model. The system deploys the FCDM-YOLOv8 model at the back end and uses the PyQt5 framework at the front end. It can accurately identify and locate wheat spiders and aphids, providing technical support for precision pesticide application. In addition, the system can count the number of targets in each image to evaluate pest density.
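Because the gains above are stated in percentage points rather than relative percent, the baseline scores they imply can be recovered by simple subtraction. The sketch below does this for the self-collected field pest dataset; the baseline values are derived here from the reported numbers, not quoted from the paper, and may differ from its tables by rounding.

```python
# FCDM-YOLOv8 scores (%) and gains over YOLOv8n (percentage points),
# as reported for the self-collected field pest dataset.
fcdm = {"precision": 81.4, "recall": 73.5, "mAP0.5": 80.1, "mAP0.5~0.95": 41.1}
gain = {"precision": 2.0, "recall": 5.2, "mAP0.5": 5.1, "mAP0.5~0.95": 2.8}


def implied_baseline(scores, gains):
    """Subtract each percentage-point gain to get the implied baseline score."""
    return {name: round(scores[name] - gains[name], 1) for name in scores}


baseline = implied_baseline(fcdm, gain)
for name, value in baseline.items():
    relative = 100 * gain[name] / value  # relative improvement in percent
    print(f"{name}: baseline {value}%, +{gain[name]} points ({relative:.1f}% relative)")
```

The relative figure is printed only to make the distinction explicit: a gain of 5.2 percentage points in recall is a larger relative change than the same number read as a percentage would suggest.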
In summary, this research provides technical support for the intelligent detection of small crop pest targets in field environments.