Lightweight tomato pest detection model based on improved YOLOv8n

王娟; 李斯睿; 王春山; 张弛; 梁姿佳

doi:10.11975/j.issn.1002-6819.202503049

Detecting tomato pests using lightweight improved YOLOv8n

Abstract

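Abstract: To detect tomato pests in complex environments efficiently and accurately on mobile devices, this study took YOLOv8n as the baseline and developed an improved lightweight tomato pest detection model, YOLOv8n-DFS. The neck was replaced with a cross-scale feature fusion module (CCFM) structure, which fuses, modulates, and reuses features from different layers, reducing model complexity while improving robustness in complex scenes. The ADown downsampling module was introduced to replace some of the convolution modules; it lowers the computation and parameter count of the model through fewer channels and smaller spatial dimensions, and preserves detection performance through combined pooling. A lightweight self-attention mechanism was integrated into the YOLOv8n detection head, reducing computation while strengthening feature extraction in complex scenes through its ability to capture context. On a dataset built in this study covering five pests (aphids, tobacco whitefly (Bemisia tabaci), cutworm (Agrotis ypsilon), cotton bollworm (Helicoverpa armigera), and Spodoptera litura), the improved YOLOv8n-DFS model achieved a precision, recall, and mean average precision (mAP) of 91.5%, 88.9%, and 93.5%, respectively, which are 0.7, 1.8, and 0.5 percentage points higher than the baseline. Its floating-point operations (FLOPs), parameter count, and model size were 3.2 G, 1.497×10⁶, and 3.13 MB, respectively, which are 60.5%, 50.2%, and 47.7% lower than the baseline. Compared with Faster R-CNN, SSD, and five other original YOLO-series models (YOLOv7-tiny, YOLOv9-t, YOLOv10n, YOLOv11n, and YOLOv12n), mAP was higher by 4.9, 2.2, 1.2, 0.4, 1.7, 1.1, and 1.3 percentage points; FLOPs were lower by 945.0, 56.9, 10.0, 7.5, 5.0, 3.1, and 3.3 G; parameters were lower by 26.8, 10.7, 4.5, 1.1, 1.2, 1.1, and 1.1 M; and model size was smaller by 104.9, 44.1, 8.6, 2.7, 2.4, 2.1, and 2.2 MB, respectively. In a deployment test on an edge computing device covering 33 pest instances, YOLOv8n-DFS produced only 2 missed detections and 1 false detection; compared with the baseline, its detections were closer to the ground truth and its frame rate was higher, reaching 25.3 frames/s. These results show that YOLOv8n-DFS is lighter and detects better, copes effectively with interference from complex conditions, and meets the high-precision and lightweight requirements of tomato pest detection on mobile devices.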
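To make the parameter saving from ADown concrete, the sketch below counts convolution weights for one downsampling step. It assumes the ADown layout published with YOLOv9 (input channels split in half after average pooling, one half passed through a 3x3 stride-2 convolution, the other through max pooling followed by a 1x1 convolution); the abstract does not spell out the layer details, so this layout, the function names, and the channel sizes are illustrative assumptions, and BatchNorm parameters are ignored.

```python
def conv_weights(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a k x k convolution without bias (BatchNorm ignored)."""
    return c_in * c_out * k * k


def plain_downsample_weights(c_in: int, c_out: int) -> int:
    """A single 3x3 stride-2 convolution used as the downsampling step."""
    return conv_weights(c_in, c_out, 3)


def adown_weights(c_in: int, c_out: int) -> int:
    """ADown as assumed here: each branch sees half the input channels
    and produces half the output channels."""
    half_in, half_out = c_in // 2, c_out // 2
    conv_branch = conv_weights(half_in, half_out, 3)  # 3x3 stride-2 conv branch
    pool_branch = conv_weights(half_in, half_out, 1)  # max pool + 1x1 conv branch
    return conv_branch + pool_branch


if __name__ == "__main__":
    c_in, c_out = 128, 256  # hypothetical channel sizes, not taken from the paper
    plain = plain_downsample_weights(c_in, c_out)
    adown = adown_weights(c_in, c_out)
    print(f"plain conv: {plain} weights")
    print(f"ADown:      {adown} weights")
    print(f"reduction:  {1 - adown / plain:.1%}")
```

Under these assumptions the two branches together need 10/36 of the weights of the plain convolution, whatever the (even) channel sizes, which is consistent with the direction of the reductions reported in the abstract. It does not reproduce the reported model-level figures, since those also depend on CCFM and the detection head.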

Abstract: Green, pollution-free vegetables have become increasingly popular in recent years as awareness of food safety has grown. Producing safe, non-toxic vegetables requires careful control of fertilizers and pesticides, yet pests seriously threaten the yield and quality of tomatoes. Timely and accurate identification of pests helps protect tomato yield and quality while limiting chemical inputs. However, tomato pests are diverse in form and appear against complex backgrounds. Manual inspection is highly subjective and prone to missed detections, so it cannot meet the needs of large-scale production. Accurate pest detection with a lightweight, deployable model is therefore required in order to balance detection performance against model complexity. This study proposes a lightweight improved model, YOLOv8n-DFS, to detect tomato pests in complex environments efficiently and accurately on edge computing devices. First, the neck was replaced with the Cross-scale Feature Fusion Module (CCFM), which fuses, modulates, and reuses the features of different detection layers to reduce model complexity, while integrating feature information from multiple scales to enhance robustness in complex scenes. Second, the ADown module was introduced to replace some of the Conv modules. ADown uses fewer channels and smaller spatial dimensions, and retains detection performance through combined pooling, so the model is optimized along both the channel and scale dimensions. Combined with CCFM, it reduces the structural complexity, computational load, and parameter count of the model. Finally, an optimized lightweight self-attention detection head was designed. The computational load of the detection head was reduced by combining downsampling and upsampling.
Moreover, the self-attention mechanism was used to capture global context information and long-range dependencies, which improved feature extraction in complex scenes. A skip connection was added to the attention mechanism to avoid overfitting caused by excessive reliance on attention, preserving the original features while suppressing irrelevant background. A series of tests was conducted on a self-built dataset of five pests: Aphididae, Bemisia tabaci, Agrotis ypsilon, Helicoverpa armigera, and Spodoptera litura. The results showed that the modifications improved both the detection performance and the compactness of the YOLOv8n-DFS model. In both respects, the improved YOLOv8n-DFS model outperformed mainstream models, including Faster R-CNN, SSD, YOLOv7-tiny, YOLOv8n, YOLOv9-t, YOLOv10n, YOLOv11n, and YOLOv12n. The precision, recall, and mAP of YOLOv8n-DFS reached 91.5%, 88.9%, and 93.5%, respectively. Compared with YOLOv8n, the precision, recall, and mAP increased by 0.7, 1.8, and 0.5 percentage points, respectively. The FLOPs, parameter count, and model size of YOLOv8n-DFS were 3.2 G, 1.497×10⁶, and 3.13 MB, respectively, which are 60.5%, 50.2%, and 47.7% lower than those of YOLOv8n. Compared with Faster R-CNN, SSD, and the other five original models of the YOLO series, the mAP increased by 4.9, 2.2, 1.2, 0.4, 1.7, 1.1, and 1.3 percentage points, respectively. The FLOPs were reduced by 945.0, 56.9, 10.0, 7.5, 5.0, 3.1, and 3.3 G, respectively, and the parameter counts were reduced by 26.8, 10.7, 4.5, 1.1, 1.2, 1.1, and 1.1 M, respectively. The model size was reduced by 104.9, 44.1, 8.6, 2.7, 2.4, 2.1, and 2.2 MB, respectively. A deployment test was then performed on an edge computing device.
The frame rate of YOLOv8n-DFS reached 12.2 frames per second, and 25.3 frames per second after TensorRT acceleration, which was 59.1% higher than that of YOLOv8n. On the edge computing device, the detection results of YOLOv8n-DFS were closer to the ground truth than those of YOLOv8n. For 33 pest instances, YOLOv8n-DFS produced only 2 missed detections and 1 false detection, indicating that the model is both lighter and more accurate. The model copes effectively with interference from complex conditions, and thus meets the high-precision and lightweight requirements of pest detection on edge computing devices.
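The abstract reports the improved model's values together with the relative changes against YOLOv8n, but not the baseline values themselves. As a reading aid, the sketch below backs out the baseline figures those numbers imply. The function and dictionary names are illustrative, and the outputs are derived estimates subject to rounding in the reported percentages, not values stated in the abstract.

```python
def implied_baseline(improved: float, reduction_pct: float) -> float:
    """Baseline value implied when `improved` is `reduction_pct` percent lower."""
    return improved / (1.0 - reduction_pct / 100.0)


def implied_baseline_score(improved: float, gain_points: float) -> float:
    """Baseline score implied by a gain expressed in percentage points."""
    return improved - gain_points


# (value for YOLOv8n-DFS, reported reduction in percent), as given in the abstract
COST_METRICS = {
    "FLOPs (G)": (3.2, 60.5),
    "Parameters (M)": (1.497, 50.2),
    "Model size (MB)": (3.13, 47.7),
}

# (value for YOLOv8n-DFS in percent, reported gain in percentage points)
ACCURACY_METRICS = {
    "Precision (%)": (91.5, 0.7),
    "Recall (%)": (88.9, 1.8),
    "mAP (%)": (93.5, 0.5),
}


def implied_baselines() -> dict:
    """Collect the implied YOLOv8n values for all six reported metrics."""
    result = {}
    for name, (value, pct) in COST_METRICS.items():
        result[name] = implied_baseline(value, pct)
    for name, (value, gain) in ACCURACY_METRICS.items():
        result[name] = implied_baseline_score(value, gain)
    return result


if __name__ == "__main__":
    for name, value in implied_baselines().items():
        print(f"{name}: {value:.2f}")
```

The distinction between the two helper functions matters when reading the abstract: the cost metrics are quoted as relative reductions (percent), whereas the accuracy metrics are quoted as absolute differences (percentage points).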