
Lightweight detection method for safflower at full bloom stage based on YOLO-GESCW

  • Abstract: To improve the detection accuracy and real-time performance for safflower at the full bloom stage, this study proposed YOLO-GESCW, a lightweight detection method based on YOLOv8n. Ghost convolution (Ghost Conv) was introduced to reduce the computational overhead of deep convolution layers; the efficient channel attention (ECA) module replaced part of the C2f modules in the backbone and neck networks, reducing the number of model parameters and the computational cost; the spatial pyramid pooling with efficient layer aggregation network (SPPELAN) module was introduced to strengthen multi-scale feature extraction; the C3k2 module was added to the neck network to improve the capture of contours and fine details; and the Wise-IoUv3 function was used to optimize the bounding-box loss. The results showed that the improved model had about 0.98 M parameters, a model size of 2.09 MB, and 3.4 G floating-point operations, which were 67.39%, 65.10%, and 58.02% lower than those of the original YOLOv8n, respectively. On the test set, the precision was 93.6%, the recall was 93.0%, the mean average precision (mAP0.5) was 98.1%, and the average detection speed was 106.79 frames/s. Deployed on an NVIDIA Jetson Xavier NX edge device, the exported best.engine model was 3.48 MB, with an average frame rate of 52.36 frames/s and an average single-image inference time of 0.0072 s. The study achieved a balance between model lightweighting and detection accuracy, and can provide a solution for deploying detection models on edge devices for safflower harvesting at the full bloom stage.


    Abstract: This study aimed to improve the detection accuracy and real-time performance for safflower at the full bloom stage in complex field environments, where traditional models are constrained by high computational complexity and poor adaptability to edge deployment. A lightweight detection method named YOLO-GESCW was proposed based on YOLOv8n. A self-built dataset of safflower at the full bloom stage was constructed, covering scenarios with different lighting, weather, target scales, and occlusion levels. Five targeted improvements were systematically implemented. Firstly, Ghost convolution (Ghost Conv) replaced all standard convolutions in the backbone network and the standard convolutions in the 16th and 19th layers of the neck network, reducing the computational overhead of deep convolution layers. Secondly, the efficient channel attention (ECA) module substituted for all C2f modules in the backbone network and the C2f modules in the 12th and 21st layers of the neck network, decreasing the number of parameters and the computational cost for a lighter model. Thirdly, the spatial pyramid pooling with efficient layer aggregation network (SPPELAN) module replaced the original SPPF (spatial pyramid pooling-fast) module to fuse multi-scale features, enhancing the detection of irregular, small-scale targets. Fourthly, the C3k2 module was added to the neck network to improve the capture of object contours and fine details. Finally, the Wise-IoUv3 loss function replaced the original bounding-box loss to reduce the error between predicted and ground-truth boxes. Comprehensive experiments were conducted, including a comparative analysis against nine relevant models (among them the mainstream YOLOv9t, YOLOv10n, YOLOv11n, and YOLOv12n), Gradient-weighted Class Activation Mapping (Grad-CAM) visualization of the 7th-9th layers of the backbone networks of YOLOv8n and YOLO-GESCW, and deployment verification on an NVIDIA Jetson Xavier NX edge device. The results showed that YOLO-GESCW had 0.98 M parameters, a model size of 2.09 MB, and 3.4 G FLOPs, which were 67.39%, 65.10%, and 58.02% lower than those of YOLOv8n, respectively. On the test set, it reached a precision of 93.6%, a recall of 93.0%, and a mean average precision (mAP@0.5) of 98.1%, with an average detection speed of 106.79 frames per second. Compared with the four mainstream models, its parameter count was reduced by 50.25%, 56.83%, 62.02%, and 61.72%, and its FLOPs by 55.3%, 47.7%, 46.0%, and 46.0%, respectively. Among the nine comparative models, YOLO-GESCW achieved the highest mAP@0.5 (98.15%); its precision was 0.5 percentage points lower than that of SF-YOLO but higher than those of the other models, and its recall was 0.9 percentage points lower than that of Improved YOLOv7 but higher than the rest. Notably, it was the only model with fewer than one million parameters (0.98 M), consuming fewer computational resources and thus better meeting the lightweight requirements of edge deployment. Grad-CAM results confirmed that the improved model focused more closely on the targets, consistent with the intended optimizations. Deployed on the edge device, the 3.48 MB best.engine model achieved 52.36 frames per second and a single-image inference time of 0.0072 s on 757 unseen images from the self-built dataset. The YOLO-GESCW model effectively balanced lightweight design, high detection accuracy, and real-time responsiveness, overcoming the key limitations of traditional models in complex field scenarios. It provided a reliable technical reference for the real-time detection of safflower at the full bloom stage and for practical deployment on resource-constrained edge devices, laying a solid foundation for the development of intelligent safflower harvesting technology.
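To make the first modification concrete, the following is a minimal PyTorch sketch of a Ghost convolution block in the spirit of GhostNet: a primary convolution produces half of the output channels, and a cheap depthwise convolution derives the remaining "ghost" half, roughly halving the multiply-adds of an equally wide standard convolution. This is a generic sketch, not the authors' exact implementation; the class name, the 5x5 cheap-operation kernel, and the SiLU activation are assumptions.

import torch
import torch.nn as nn

class GhostConv(nn.Module):
    # Ghost convolution: a primary conv produces c_out/2 channels and a
    # cheap depthwise conv generates the other half ("ghost" features).
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        c_hidden = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_hidden, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_hidden),
            nn.SiLU(),
        )
        # Cheap operation: 5x5 depthwise conv over the primary features.
        self.cheap = nn.Sequential(
            nn.Conv2d(c_hidden, c_hidden, 5, 1, 2, groups=c_hidden, bias=False),
            nn.BatchNorm2d(c_hidden),
            nn.SiLU(),
        )

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

# Shape check: (1, 64, 80, 80) -> (1, 128, 80, 80)
print(GhostConv(64, 128)(torch.randn(1, 64, 80, 80)).shape)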
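The ECA substitution can be sketched just as briefly: channel attention weights come from global average pooling followed by a fast 1D convolution whose kernel size adapts to the channel count, so the module adds almost no parameters. The kernel-size heuristic below uses the gamma=2, b=1 defaults from the ECA-Net paper; whether this study keeps those defaults is an assumption.

import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    # Efficient channel attention: global average pooling + 1D conv
    # across the channel axis, then sigmoid gating of the input.
    def __init__(self, channels, gamma=2, b=1):
        super().__init__()
        # Adaptive odd kernel size derived from the channel count.
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.gate = nn.Sigmoid()

    def forward(self, x):
        w = self.pool(x)                                # (B, C, 1, 1)
        w = self.conv(w.squeeze(-1).transpose(1, 2))    # (B, 1, C)
        w = self.gate(w.transpose(1, 2).unsqueeze(-1))  # (B, C, 1, 1)
        return x * w

print(ECA(256)(torch.randn(1, 256, 40, 40)).shape)  # shape is unchanged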
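SPPELAN combines SPP-style chained max-pooling with ELAN-style aggregation of every intermediate output. The sketch below follows the structure published with the YOLOv9 code (a 1x1 conv, three chained 5x5 max-pools, concatenation of all four branches, and a fusing 1x1 conv); the hidden width c3 is a free design choice here and may differ from the configuration used in the paper.

import torch
import torch.nn as nn

def conv_bn_silu(c_in, c_out, k=1):
    # kxk convolution + batch norm + SiLU, the usual YOLO building block.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, 1, k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.SiLU(),
    )

class SPPELAN(nn.Module):
    # SPP-ELAN: chained max-pools enlarge the receptive field; all
    # intermediate outputs are concatenated and fused by a 1x1 conv.
    def __init__(self, c1, c2, c3):
        super().__init__()
        self.cv1 = conv_bn_silu(c1, c3)
        self.pools = nn.ModuleList(
            nn.MaxPool2d(5, stride=1, padding=2) for _ in range(3)
        )
        self.cv5 = conv_bn_silu(4 * c3, c2)

    def forward(self, x):
        y = [self.cv1(x)]
        for pool in self.pools:
            y.append(pool(y[-1]))
        return self.cv5(torch.cat(y, dim=1))

print(SPPELAN(256, 256, 128)(torch.randn(1, 256, 20, 20)).shape)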
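Finally, the Wise-IoUv3 bounding-box loss can be stated compactly. Following the Wise-IoU paper (Tong et al., 2023), v1 scales the IoU loss with a distance-based attention term, and v3 adds a non-monotonic focusing coefficient driven by the outlier degree $\beta$; the abstract does not give the hyperparameters $\alpha$ and $\delta$, so only the general form is shown:

$\mathcal{L}_{\mathrm{WIoUv1}} = \mathcal{R}_{\mathrm{WIoU}}\,\mathcal{L}_{\mathrm{IoU}}, \qquad \mathcal{R}_{\mathrm{WIoU}} = \exp\!\left(\frac{(x - x_{gt})^2 + (y - y_{gt})^2}{\big(W_g^2 + H_g^2\big)^{*}}\right)$

$\beta = \frac{\mathcal{L}_{\mathrm{IoU}}^{*}}{\overline{\mathcal{L}_{\mathrm{IoU}}}}, \qquad r = \frac{\beta}{\delta\,\alpha^{\beta-\delta}}, \qquad \mathcal{L}_{\mathrm{WIoUv3}} = r\,\mathcal{L}_{\mathrm{WIoUv1}}$

Here $\mathcal{L}_{\mathrm{IoU}} = 1 - \mathrm{IoU}$, $(x, y)$ and $(x_{gt}, y_{gt})$ are the centers of the predicted and ground-truth boxes, $W_g$ and $H_g$ are the width and height of the smallest enclosing box, $\overline{\mathcal{L}_{\mathrm{IoU}}}$ is a running mean maintained during training, and the superscript $*$ marks quantities detached from gradient computation. Because $r$ is small for both very low- and very high-quality anchors, v3 concentrates the gradient on ordinary-quality boxes, which is consistent with the error reduction between predicted and ground-truth boxes described in the abstract.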
