Abstract:
Rapeseed is one of the most important oilseed crops in China, and weeds are a major factor limiting its yield in farmland. Effective weed removal is therefore required to improve rapeseed production. In recent years, intelligent equipment has shown promise for precise pesticide application and mechanical weeding, and machine vision can be used to recognize and locate rapeseed and weeds in the field accurately and rapidly. In this study, a real-time recognition algorithm (CSE-YOLOXs) was proposed to accurately detect rapeseed and weeds in complex field environments, where targets are small and often occluded. Firstly, YOLOXs was used as the base model. Two attention mechanisms, CBAM (Convolutional Block Attention Module) and CA (Coordinate Attention), were introduced to reduce background interference and focus the network on rapeseed and weeds. Each mechanism was evaluated when placed after the backbone outputs and after the FPN (Feature Pyramid Network) outputs. Secondly, the high-resolution feature map (F2) from the backbone was fused with the feature maps at different scales in the FPN. A Swin Transformer Block and a P2 prediction head were combined into a high-resolution Swin Transformer prediction head, in order to extract richer global features for the accurate localization of small targets. Finally, the EIoU (Enhanced Intersection over Union) loss function was adopted to better represent the proximity between predicted and ground-truth boxes, so that the improved model regresses target positions and shapes more accurately and localizes rapeseed and weeds more precisely. The improved CSE-YOLOXs model was compared with six object detection models (SSD, Faster R-CNN, YOLOv5s, YOLOv7, YOLOv8s, and YOLOXs).
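As an illustration of the loss function named above, the following is a minimal sketch of an EIoU-style loss for a single pair of axis-aligned boxes. It assumes the formulation commonly given in the literature (one minus IoU, plus a normalized centre-distance term and separate width and height terms normalized by the smallest enclosing box). The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are illustrative choices, not details taken from this study.

```python
def eiou_loss(pred, target, eps=1e-9):
    """EIoU-style loss for two boxes given as (x1, y1, x2, y2).

    Illustrative sketch only: assumes the commonly published formulation,
    not necessarily the exact implementation used in CSE-YOLOXs.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target
    pw, ph = px2 - px1, py2 - py1
    tw, th = tx2 - tx1, ty2 - ty1

    # Overlap area and IoU.
    iw = max(0.0, min(px2, tx2) - max(px1, tx1))
    ih = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, tx2) - min(px1, tx1)
    ch = max(py2, ty2) - min(py1, ty1)

    # Squared distance between box centres, normalized by the
    # squared diagonal of the enclosing box.
    dx = (px1 + px2) / 2.0 - (tx1 + tx2) / 2.0
    dy = (py1 + py2) / 2.0 - (ty1 + ty2) / 2.0
    centre_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Width and height differences, normalized separately.
    width_term = (pw - tw) ** 2 / (cw * cw + eps)
    height_term = (ph - th) ** 2 / (ch * ch + eps)

    return 1.0 - iou + centre_term + width_term + height_term


if __name__ == "__main__":
    truth = (0.0, 0.0, 2.0, 2.0)
    print(eiou_loss(truth, truth))                 # perfect match, close to 0
    print(eiou_loss((0.5, 0.0, 2.5, 2.0), truth))  # shifted prediction
    print(eiou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0)))  # disjoint
```

Unlike a plain IoU loss, this form still varies when the boxes do not overlap, because the centre, width, and height terms continue to change with the prediction. That matters for small targets, where a small offset can already eliminate the overlap.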
Furthermore, three field experiments were conducted with CSE-YOLOXs deployed on a Mechanist F117 laptop with an RTX 3060 GPU. A smartphone connected over a local area network acquired rapeseed and weed images for real-time recognition under sunny, cloudy, and overcast conditions. The experimental results show that the CSE-YOLOXs model achieved an average precision (AP) of 90.03% for rapeseed and 81.58% for weeds, with only 9.61M parameters and a frame rate of 68 fps, meeting the requirements of real-time detection. Among the attention configurations, CA placed after the FPN performed best, improving the mAP and mF1 by 1.14 and 1.54 percentage points, respectively. With the high-resolution Swin Transformer prediction head, detection accuracy on small targets improved: the AP increased by 2.20 and 5.23 percentage points for rapeseed and weeds, respectively, and the mAP and mF1 increased by 3.87 and 5.26 percentage points, respectively. Replacing the IoU loss function with EIoU gave further increases of 0.28 and 0.96 percentage points in the mAP and mF1, respectively. Overall, compared with the base model YOLOXs, the mAP and mF1 improved by 5.29 and 7.76 percentage points, respectively. Compared with the six object detection models (SSD, Faster R-CNN, YOLOv5s, YOLOv7, YOLOv8s, and YOLOXs), the mAP of CSE-YOLOXs was higher by 18.43, 32.18, 7.90, 2.74, 6.25, and 5.29 percentage points, respectively. In the field tests, CSE-YOLOXs achieved an average precision of 83.77% with an average detection time of about 33 ms per image, demonstrating robustness and real-time performance. These findings can provide technical support for precise pesticide spraying and intelligent weeding in rapeseed fields.
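The reported figures can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted in the abstract. It assumes that the three gains are cumulative ablation steps, and the step labels are an interpretation rather than wording from the study. It also converts between frame rate and per-image latency with hypothetical helper functions `fps_to_ms` and `ms_to_fps`.

```python
# Stepwise gains in percentage points (mAP, mF1), copied from the abstract.
# The step labels are an interpretation, not the authors' wording.
steps = {
    "CA attention after FPN": (1.14, 1.54),
    "high-resolution prediction head": (3.87, 5.26),
    "EIoU loss": (0.28, 0.96),
}
reported_total = (5.29, 7.76)  # overall gain over YOLOXs (mAP, mF1)

total_map = sum(m for m, _ in steps.values())
total_mf1 = sum(f for _, f in steps.values())


def fps_to_ms(fps):
    """Per-image latency in milliseconds for a given frame rate."""
    return 1000.0 / fps


def ms_to_fps(ms):
    """Frame rate implied by a per-image latency in milliseconds."""
    return 1000.0 / ms


if __name__ == "__main__":
    print(f"sum of mAP steps: {total_map:.2f} (reported {reported_total[0]})")
    print(f"sum of mF1 steps: {total_mf1:.2f} (reported {reported_total[1]})")
    print(f"68 fps   -> {fps_to_ms(68):.1f} ms per image")
    print(f"33 ms    -> {ms_to_fps(33):.1f} fps")
```

Under that assumption the three steps sum to the reported overall gains of 5.29 (mAP) and 7.76 (mF1) percentage points. The field latency of about 33 ms per image corresponds to roughly 30 frames per second, compared with roughly 15 ms per image at the reported 68 fps.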