A multi-scale object detection method for rose flowering stages based on YOLOv8n-dy

朱惠斌; 汤其林; 白丽珍; 李仕; 王明鹏; 董金燕; 李垚

doi:10.11975/j.issn.1002-6819.202507052

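Abstract (translated from the Chinese): To address the high labor intensity of rose picking, the aging of flower-picking farmers, and rising labor costs, and to remedy the insufficient detection and recognition accuracy encountered during rose picking, a rose flowering-stage detection model based on YOLOv8n (YOLOv8n-dy) is proposed to accurately distinguish the bud, full-bloom, and withered stages. The C2f module in the backbone network is optimized to build an efficient multi-branch feature-extraction convolutional structure, improving the model's ability to recognize flowering-stage targets of multiple scales and shapes. The ELA attention mechanism (efficient local attention) is introduced, a small-target detection layer is added, and the loss function is replaced with the WIOUv3 loss (weighted intersection over union version 3), strengthening the deep network's ability to localize and detect small targets such as bud-stage roses. The Adamax optimizer (adaptive maximum optimization) is adopted to keep the model from becoming trapped in local optima. Experimental results show that the precision, recall, and mean average precision at an IoU threshold of 50% (mAP₀.₅) of YOLOv8n-dy reach 79.2%, 71.3%, and 77.3%, respectively, which are 4.4%, 6.7%, and 6.1% higher than those of the original YOLOv8n model. Compared with YOLOv9, which has the highest detection accuracy, YOLOv8n-dy has a computational complexity of only 9.8 GFLOPs while maintaining an mAP₀.₅ (0.773) comparable to that of YOLOv9, and its weight is only 4.22% of that of the YOLOv9 model, significantly improving its deployment performance. The proposed YOLOv8n-dy model performs well in complex field environments: it efficiently and accurately identifies the bud, full-bloom (flower), and withered stages of roses while achieving a good balance between detection accuracy and computational efficiency, providing reliable technical support for rose picking.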
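The mAP₀.₅ metric reported above counts a predicted box as correct only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The following is a minimal sketch of that matching criterion, not the authors' evaluation code; the function names `iou` and `is_match` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    The corner-coordinate layout is an assumption for this sketch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True when a prediction overlaps the ground truth enough to count at mAP@0.5."""
    return iou(pred_box, gt_box) >= threshold
```

A small box such as a bud-stage rose loses IoU quickly under a shift of a few pixels, which is one reason small targets are harder to score well on at a fixed threshold.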

Abstract: Rose cultivation faces intense physical demands in manual harvesting, an aging farmer population, and rising labor costs, and these problems are compounded by the lack of accurate and efficient automated vision systems for identifying rose flowering stages. To address this, a model based on the YOLOv8n framework, designated YOLOv8n-dy, was proposed for rose flowering stage detection. The backbone network was optimized by enhancing the C2f module to construct a more efficient multi-branch convolutional block, improving recognition of multi-scale and multi-morphology flowering targets. The Efficient Local Attention (ELA) mechanism was introduced to bolster the detection of small targets, and an additional small-target detection layer was added. The loss function was replaced with the Weighted Intersection over Union version 3 (WIOUv3) loss to improve the localization and detection of small targets such as bud-stage roses. The Adamax optimizer was employed to keep the model from getting stuck in local optima. Experimental results showed that the YOLOv8n-dy model improved on the original YOLOv8n model by 4.4% in precision, 6.7% in recall, and 6.1% in mAP₀.₅. In a comparison with other state-of-the-art detectors, YOLOv9 achieved the highest raw detection accuracy, while the YOLOv8n-dy model maintained a competitive mAP₀.₅ of 0.773. The most significant advantage of the proposed model was its reduced computational footprint: it required only 9.8 giga floating-point operations (GFLOPs), just 3.67% of the computational demand of the YOLOv9 model. In addition, the weight of the YOLOv8n-dy model was only 4.22% of that of the YOLOv9 model, which significantly improves its deployment and application performance.
This exceptional efficiency did not come at the cost of field performance, as the model consistently and accurately identified all three flowering stages under diverse and challenging real-world field conditions, including varying lighting and occlusions. In conclusion, the YOLOv8n-dy model presented in this work successfully fulfills its design objectives by achieving an optimal and practical balance between high detection accuracy and low computational complexity. The strategic architectural innovations, including the optimized feature extraction blocks, attention mechanism, and advanced loss function, collectively contributed to its robust performance. The model's markedly low computational requirement makes it exceptionally suitable for real-time deployment on embedded systems and mobile platforms, which are critical for practical agricultural applications. Consequently, this research provides a reliable, efficient, and scalable technological solution for automating rose harvesting, potentially mitigating labor shortages, lowering operational costs, and promoting greater sustainability in modern floriculture and precision agriculture.
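The relative figures in the abstract can be turned into rough absolute values with a little arithmetic. The sketch below is a back-of-envelope check only: it assumes the reported gains of 4.4%, 6.7%, and 6.1% are absolute percentage points (the abstract does not say), and the derived baseline and YOLOv9 values are implied estimates, not numbers reported by the authors.

```python
# Figures stated in the abstract for YOLOv8n-dy (percent).
dy_metrics = {"precision": 79.2, "recall": 71.3, "mAP50": 77.3}
gains = {"precision": 4.4, "recall": 6.7, "mAP50": 6.1}

# ASSUMPTION: gains are absolute percentage points, so the implied
# YOLOv8n baseline is a simple difference. If the gains were relative,
# these values would differ.
implied_baseline = {k: round(dy_metrics[k] - gains[k], 1) for k in dy_metrics}

# Computational cost: 9.8 GFLOPs is stated to be 3.67% of YOLOv9's demand.
dy_gflops = 9.8
gflops_share = 0.0367
implied_yolov9_gflops = dy_gflops / gflops_share  # roughly 267 GFLOPs

if __name__ == "__main__":
    print("Implied YOLOv8n baseline (%):", implied_baseline)
    print(f"Implied YOLOv9 cost: {implied_yolov9_gflops:.1f} GFLOPs")
```

Under these assumptions the YOLOv9 variant used for comparison would need on the order of 27 times the computation of YOLOv8n-dy, which is the gap the abstract's deployment argument rests on.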

A multi-scale object detection method for rose flowering stages based on YOLOv8n-dy