A lightweight grape berry detection method for the fruit-thinning period based on an improved YOLO11n

冷欣; 陈涛; 李永康; 黄建平; 宋文龙

doi:10.11975/j.issn.1002-6819.202510073


Abstract

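Abstract: To meet the demand for accurate berry information in early grape yield estimation and in thinning operations during the fruit-thinning period, and to address the constraints on deploying detection models in real agricultural settings, this study focuses on jointly optimizing high accuracy, low parameter count, and real-time performance, and proposes BD-YOLO, a lightweight grape berry detection method for the thinning period based on an improved YOLO11n. First, the original backbone is replaced with RepViT, a lightweight convolutional neural network designed along the lines of the vision transformer (ViT) architecture; through depthwise convolution (DWConv) and reparameterization, it reduces the parameter count and floating-point operations while preserving the model's basic feature extraction. Second, to mitigate the accuracy loss that lightweighting may cause, a dual-layer high-resolution detection structure is designed in the detection head to strengthen the model's ability to capture features of small berry targets and background details. Finally, to account for learning samples of differing difficulty, such as occluded berries, and for the fact that berry bounding boxes have similar aspect ratios but varying sizes, a Focaler-MPDIoU loss function is designed by combining minimum point distance (MPD) with the idea of Focaler-IoU to improve localization accuracy. Experimental results show that, on the self-built dataset, BD-YOLO reduces the parameter count, model size, and floating-point operations by 89.2%, 81.8%, and 9.5%, respectively, relative to the baseline model, while mean average precision (mAP) and recall increase by 3.5 and 4.3 percentage points, respectively, and detection speed rises from 114.2 frames/s to 142.1 frames/s. In a generalization comparison on a public dataset, BD-YOLO reaches a mean average precision of 89.8%, with no significant performance degradation relative to the self-built dataset (91.8%), demonstrating strong detection robustness. In addition, deployment tests on an edge device show that BD-YOLO reaches an inference speed of 35.6 frames/s, 26.7% higher than the baseline model. This study can provide an algorithmic reference with application potential for accurate berry detection during the grape thinning period and for future low-cost, large-scale deployment on orchard edge devices.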
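
The two backbone techniques named above, depthwise convolution and reparameterization, can be illustrated with a small pure-Python sketch. It is not the authors' code. It assumes a block with three parallel per-channel branches (a 3x3 depthwise kernel, a 1x1 depthwise scale, and an identity shortcut), omits batch normalization, and uses hypothetical function names and channel sizes chosen only for illustration. It shows why a depthwise-separable layer has far fewer weights than a standard convolution, and how the parallel branches used at training time collapse into one 3x3 kernel at inference time without changing the output.

```python
# Illustrative sketch only: function names, shapes, and the three-branch layout
# are assumptions for this example, not taken from the BD-YOLO implementation.

def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def dw_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def dw_conv2d(x, kernel):
    """Zero-padded 'same' 2-D correlation for ONE channel of a depthwise conv."""
    k = len(kernel)
    pad = k // 2
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    ii, jj = i + di - pad, j + dj - pad
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[di][dj] * x[ii][jj]
            out[i][j] = acc
    return out


def multi_branch(x, k3, k1):
    """Training-time form: 3x3 depthwise branch + 1x1 depthwise branch + identity."""
    y3 = dw_conv2d(x, k3)
    return [[y3[i][j] + k1 * x[i][j] + x[i][j] for j in range(len(x[0]))]
            for i in range(len(x))]


def fuse_branches(k3, k1):
    """Inference-time form: fold the 1x1 scale and the identity into the 3x3 centre tap."""
    fused = [row[:] for row in k3]  # copy so the original kernel is untouched
    fused[1][1] += k1 + 1.0
    return fused


if __name__ == "__main__":
    # Hypothetical layer sizes, used only to compare weight counts.
    print(conv_params(64, 128, 3), dw_separable_params(64, 128, 3))

    x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
    k3 = [[0.5, 0.0, -0.5], [1.0, 2.0, 1.0], [0.25, 0.0, 0.25]]
    print(multi_branch(x, k3, 0.75))
    print(dw_conv2d(x, fuse_branches(k3, 0.75)))
```

Because every branch is linear in the input, the fused kernel reproduces the multi-branch output exactly, so the extra branches cost nothing at inference time. In a real network the batch-normalization statistics would also be folded into the kernel and bias, which this sketch leaves out.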

Abstract: Precision viticulture relies heavily on berry phenotypic data, particularly for early grape yield estimation and fruit thinning. Object detection models for this task are often deployed in unstructured agricultural environments, where two challenges constrain them: maintaining high detection precision against complex backgrounds, and the computational limits of embedded edge devices. A model must therefore reconcile the competing requirements of high detection precision, low parameter count, and real-time inference speed. In this study, an improved lightweight YOLO11n architecture (BD-YOLO) was proposed to detect grape berries during the thinning period. Three strategies were applied to the baseline model. 1) To reduce model redundancy, the original backbone was replaced with RepViT, a lightweight convolutional neural network designed along the lines of the Vision Transformer (ViT) architecture. Depthwise convolutions (DWConv) and structural reparameterization reduced the parameter count and floating-point operations (FLOPs) of the backbone while preserving fundamental feature extraction. 2) To offset the precision degradation often associated with aggressive lightweighting, a dual-layer high-resolution structure was designed in the detection head, which enhanced sensitivity to small-scale grape berry targets and to subtle background details. 3) A Focaler-MPDIoU loss function was introduced to account for learning samples of varying difficulty caused by mutual occlusion, and for the specific morphology of grape berry targets, which share similar aspect ratios but differ in physical size. Minimum point distance (MPD) was combined with Focaler-intersection over union (Focaler-IoU) to reduce bounding box regression error. Experiments verified the advantages of the improved model.
On the custom dataset, BD-YOLO's parameter count, model size, and floating-point operations were reduced by 89.2%, 81.8%, and 9.5%, respectively, compared with the baseline YOLO11n. At the same time, mean average precision (mAP) and recall improved by 3.5 and 4.3 percentage points, respectively, with mAP reaching 91.8%, while detection speed increased from 114.2 to 142.1 frames per second (FPS). Furthermore, BD-YOLO attained an mAP of 89.5% after cross-validation for generalization on public datasets, with no significant performance degradation compared with the custom dataset (91.8%), indicating strong robustness. Deployment tests on actual edge devices showed that BD-YOLO achieved a real-time inference speed of 35.6 FPS, 26.7% higher than the baseline model. Computational cost was thus reduced while high accuracy was maintained. These findings provide a promising reference for the precise detection of grape berries during the thinning period, and the robustness and efficiency of the model make it a potential candidate for low-cost, large-scale deployment in smart orchards.
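
To make the loss design concrete, the following is a minimal sketch of how an MPD-based IoU term and the Focaler-IoU remapping could be combined. The abstract does not give the exact formula, so the combination used here (MPDIoU loss plus IoU minus the remapped IoU, following the pattern by which Focaler-IoU is usually attached to other IoU losses) is an assumption. The function names and the interval bounds `d` and `u` are likewise illustrative choices, not values taken from the study.

```python
# Illustrative sketch only: the way the two terms are combined and the default
# interval [d, u] are assumptions, not the formula published with BD-YOLO.

def iou_xyxy(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def mpdiou(pred, target, img_w, img_h):
    """IoU minus the squared top-left and bottom-right corner distances,
    each normalized by the squared diagonal of the input image."""
    d1 = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d2 = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    norm = img_w ** 2 + img_h ** 2
    return iou_xyxy(pred, target) - d1 / norm - d2 / norm


def focaler(iou, d=0.0, u=0.95):
    """Focaler-IoU remapping: 0 below d, 1 above u, linear in between."""
    if iou < d:
        return 0.0
    if iou > u:
        return 1.0
    return (iou - d) / (u - d)


def focaler_mpdiou_loss(pred, target, img_w, img_h, d=0.0, u=0.95):
    """Assumed combination: (1 - MPDIoU) + IoU - Focaler(IoU)."""
    iou = iou_xyxy(pred, target)
    return (1.0 - mpdiou(pred, target, img_w, img_h)) + iou - focaler(iou, d, u)


if __name__ == "__main__":
    target = (10.0, 10.0, 50.0, 50.0)
    for pred in [(10.0, 10.0, 50.0, 50.0), (12.0, 12.0, 52.0, 52.0), (20.0, 20.0, 60.0, 60.0)]:
        print(pred, focaler_mpdiou_loss(pred, target, 640, 640))
```

The corner-distance term distinguishes boxes that have the same aspect ratio and overlap but different sizes or offsets, which matches the motivation given for berry targets. Shifting `d` and `u` changes which IoU range the regression concentrates on, which is how the Focaler idea reweights easy and hard samples.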
