Recognizing young apples using improved YOLOv8n

师翊; 王应宽; 王菲; 卿顺浩; 赵龙; 宇文星璨

doi:10.11975/j.issn.1002-6819.202408222

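Summary: To improve the accuracy and efficiency of young apple fruit recognition, this study proposes a recognition method based on an improved YOLOv8n model. First, a diverse, feature-rich image dataset of young apple fruits was constructed; combined with transfer learning, this allows the model to extract features efficiently in complex orchard environments. Next, transfer learning was applied to the RetinaNet, EfficientDet, YOLOv5n, YOLOv8n, and YOLOv10n models, and the effects of transfer learning and of randomly initialized weights on model performance were compared. The results show that the transfer-learning-based YOLOv8n model achieved high detection accuracy while remaining lightweight, with a precision of 99.3%, a recall of 94.2%, an mAP50 of 97.0%, and an mAP50-95 of 83.1%. To improve accuracy further, the study introduced the EMCA (efficient multiscale channel attention) module to build the YOLOv8n-EMCA model, which achieved a precision of 99.6%, a recall of 95.6%, an mAP50 of 97.3%, and an mAP50-95 of 88.2%. The study provides a method for recognizing young apple fruits and a reference for recognizing the young fruits of other fruit trees.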

Abstract: Timely identification of young apple fruits is crucial for crop management and yield optimization. To improve the precision and efficiency of young apple fruit recognition, this paper proposes an approach based on the YOLOv8n model, strengthened by transfer learning. Recognizing young apple fruits is challenging because of variability in size, shape, and maturity stage, and because of changing orchard conditions such as varying lighting and occlusion by leaves or branches. To address these challenges, the study constructed a diverse dataset covering young apple fruits under a wide range of scenarios. High-resolution digital cameras were used to collect images at different times of day and at various growth stages. The images span lighting conditions from bright sunlight to shade, so that the model is robust to illumination changes, and include fruits of different sizes and maturity levels to reflect the natural heterogeneity of an orchard. Preprocessing consisted of cropping the images to focus on the fruits and applying augmentation (rotation, flipping, and color adjustment) to diversify the dataset and reduce overfitting. The study then compared transfer learning against conventional random weight initialization across several object detection models: RetinaNet, EfficientDet, YOLOv5n, YOLOv8n, and YOLOv10n. The objective was to determine which model, when initialized with pre-trained weights from a related domain, adapts best to identifying young apple fruits accurately and efficiently.
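As an illustration of the flipping and rotation augmentation described above, the following minimal sketch shows how bounding-box labels must be transformed together with the image. It assumes labels in the normalized (class, x_center, y_center, width, height) layout commonly used with YOLO models; the abstract does not state the label format or the augmentation parameters, and the function names are hypothetical.

```python
from typing import List, Tuple

# Assumed label layout (not stated in the paper):
# (class_id, x_center, y_center, width, height), coordinates normalized to [0, 1],
# origin at the top-left corner of the image.
Box = Tuple[int, float, float, float, float]


def hflip_boxes(boxes: List[Box]) -> List[Box]:
    """Mirror boxes left-right, matching a horizontal flip of the image."""
    return [(c, 1.0 - x, y, w, h) for c, x, y, w, h in boxes]


def vflip_boxes(boxes: List[Box]) -> List[Box]:
    """Mirror boxes top-bottom, matching a vertical flip of the image."""
    return [(c, x, 1.0 - y, w, h) for c, x, y, w, h in boxes]


def rot90_cw_boxes(boxes: List[Box]) -> List[Box]:
    """Rotate boxes 90 degrees clockwise; width and height swap roles."""
    return [(c, 1.0 - y, x, h, w) for c, x, y, w, h in boxes]


# One hypothetical fruit label, used only to show the transformations.
labels = [(0, 0.25, 0.5, 0.1, 0.2)]
flipped = hflip_boxes(labels)
rotated = rot90_cw_boxes(labels)
```

Color adjustment leaves the boxes unchanged, so only geometric augmentations need this kind of label update.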
Notably, the YOLOv8n model, known for balancing performance and computational efficiency, performed best when combined with transfer learning. It achieved a detection precision of 99.3% and a recall of 94.2%, meaning that it captured most of the relevant instances in the dataset. Its mean average precision at 50% IoU (mAP50) was 97.0%, and its mean average precision across multiple IoU thresholds (mAP50-95) was 83.1%. To improve detection accuracy further, the study introduced the EMCA (Efficient Multiscale Channel Attention) module and integrated it into the YOLOv8n framework to create the YOLOv8n-EMCA model. The EMCA module is designed to improve the extraction and processing of multi-scale features, so that the model attends more effectively to critical details at different spatial resolutions. With this change, precision rose slightly, by 0.3 percentage points to 99.6%, and recall rose by 1.4 percentage points to 95.6%, indicating fewer missed detections. In addition, mAP50 reached 97.3%, and the more stringent mAP50-95 improved by 5.1 percentage points to 88.2%, demonstrating robustness across a range of IoU thresholds. This research contributes a method tailored to identifying young apple fruits and serves as a reference for developing similar systems for other fruit trees. The findings highlight the potential of transfer learning and attention mechanisms to strengthen object detection, with implications for agricultural automation and machine vision applications.
By leveraging these techniques, the agriculture sector can move closer to achieving precision farming, where real-time, accurate monitoring and decision-making become the norm.
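The precision, recall, and mAP figures quoted in the abstract rest on standard object detection definitions. The sketch below spells them out in plain Python: precision and recall from true-positive, false-positive, and false-negative counts, the IoU between two boxes, and the ten IoU thresholds (0.50 to 0.95 in steps of 0.05) conventionally averaged for mAP50-95. The counts and boxes are hypothetical and are not results from the paper, and the paper's own evaluation code is not shown in the source.

```python
from typing import List, Tuple

BoxXYXY = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def iou(a: BoxXYXY, b: BoxXYXY) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def map50_95_thresholds() -> List[float]:
    """IoU thresholds 0.50, 0.55, ..., 0.95 over which AP is averaged."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


# Hypothetical counts for illustration only (not results from the paper).
p, r = precision_recall(tp=90, fp=10, fn=30)

# A hypothetical predicted box that covers three quarters of the ground truth.
overlap = iou((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 1.5))

# The detection counts as a true positive only at the thresholds it meets.
passed = [t for t in map50_95_thresholds() if overlap >= t]
```

Under these definitions a model can score a high mAP50 and a noticeably lower mAP50-95, because loosely fitting boxes pass the 0.50 threshold but fail the stricter ones.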
