
Orchard Pear Fruit Growth-Quality Detection and Binocular Localization Method Based on Improved YOLOv8n

  • Abstract: Rapid and accurate detection of pear fruit growth quality in complex orchard environments, together with real-time localization of picking points, is a key technology for intelligent robotic harvesting. To address the low recognition accuracy of pear fruit appearance in complex orchards and detection speeds that fail to meet real-time picking requirements, this study proposes an improved pear fruit detection method based on YOLOv8n. First, the C2f modules in the backbone network are replaced with the more computationally efficient FasterNet Block structure to achieve a lightweight model. Second, a global attention mechanism (GAM) is introduced into the network to strengthen the extraction of key feature information. Finally, the Inner-CIoU loss function is adopted to optimize the bounding-box regression process and improve the localization accuracy of predicted boxes. Experimental results show that the improved model achieves a precision of 96.80%, a recall of 93.40%, and a mean average precision of 96.70%, improvements of 4.0, 3.2, and 4.0 percentage points over the baseline; floating-point operations and memory footprint are reduced by 30.23% and 48.15%, respectively; on an embedded platform, the inference speed reaches 180 frames/s at a power consumption of 19 W. Combined with an Intel RealSense D455i binocular camera for 3D calibration and coordinate-system transformation, the system achieves precise spatial localization of healthy pear fruits. The recognition and localization system was deployed on a self-developed pear-picking actuator and tested outdoors on a picking platform, achieving a picking success rate of about 90.2% and an average continuous picking speed of about 5 s per fruit. This study effectively addresses technical challenges in visual perception for fruit-harvesting robots, is suitable for deployment on performance-constrained devices, and provides key recognition and localization support for harvesting robots targeting pears and similar fruits.
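The Inner-CIoU loss mentioned above replaces the plain IoU term with one computed on center-aligned auxiliary boxes whose widths and heights are scaled by a ratio (0.8 in this study). A minimal sketch of that inner-IoU computation, assuming an (cx, cy, w, h) box format; the function name and layout are illustrative, not taken from the paper's code:

```python
def inner_iou(box_pred, box_gt, ratio=0.8):
    """Inner-IoU: IoU computed on auxiliary boxes that share the original
    centers but have widths/heights scaled by `ratio`. A ratio < 1 shrinks
    the boxes, which sharpens gradients for small/overlapping targets.
    Boxes are (cx, cy, w, h)."""
    (cx1, cy1, w1, h1), (cx2, cy2, w2, h2) = box_pred, box_gt
    # Corners of the scaled (inner) auxiliary boxes
    x1min, x1max = cx1 - w1 * ratio / 2, cx1 + w1 * ratio / 2
    y1min, y1max = cy1 - h1 * ratio / 2, cy1 + h1 * ratio / 2
    x2min, x2max = cx2 - w2 * ratio / 2, cx2 + w2 * ratio / 2
    y2min, y2max = cy2 - h2 * ratio / 2, cy2 + h2 * ratio / 2
    # Intersection and union of the scaled boxes
    iw = max(0.0, min(x1max, x2max) - max(x1min, x2min))
    ih = max(0.0, min(y1max, y2max) - max(y1min, y2min))
    inter = iw * ih
    union = (w1 * ratio) * (h1 * ratio) + (w2 * ratio) * (h2 * ratio) - inter
    return inter / (union + 1e-9)
```

In the full loss, this inner-IoU term is combined with the usual CIoU distance and aspect-ratio penalties; only the overlap term changes.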


    Abstract: Rapid and accurate detection of pear fruit growth quality combined with real-time picking-point localization in complex orchard environments is a key technological enabler for intelligent agricultural harvesting robots. To address the challenges of low recognition accuracy under occlusion and variable illumination, as well as insufficient inference speed on embedded devices, this study proposes an improved pear fruit detection and localization method based on YOLOv8n incorporating three targeted enhancements. First, the C2f modules in the backbone network are replaced with FasterNet Block structures utilizing partial convolution, significantly reducing computational redundancy and optimizing memory access efficiency. Second, a global attention mechanism is introduced after the spatial pyramid pooling fast layer to enhance the extraction of critical feature information, enabling the model to focus more effectively on small targets and suppress background interference. Third, the original CIoU loss function is replaced with Inner-CIoU using a scale factor of 0.8 after systematic experimentation, providing improved gradient characteristics and accelerating convergence while enhancing localization precision for small and overlapping pear fruits. A dedicated pear fruit dataset was constructed using images captured by an Intel RealSense D455i binocular camera in natural orchards, covering multiple varieties and challenging conditions including diseases, fruit overlap, and occlusion. Data augmentation expanded the dataset to 3,000 images. Experimental results demonstrate that the proposed YOLOv8n-Pear model achieves a precision of 96.80%, a recall of 93.40%, and a mean average precision of 96.70%. Compared with the baseline YOLOv8n, these metrics are improved by 4.0, 3.2, and 4.0 percentage points, respectively. Moreover, the model reduces floating-point operations by 30.23% and memory footprint by 48.15%, from 7.1 MB to 4.2 MB. On the embedded Jetson Orin NX platform, the model achieves an average inference speed of 180.3 frames per second with a power consumption of only 19 W, demonstrating its suitability for real-time deployment on power-constrained systems. For 3D localization, the binocular camera was calibrated using Zhang's method, and a coordinate transformation pipeline was established to convert 2D pixel coordinates of detected healthy pear fruits into 3D world coordinates. Field tests show that the maximum positioning errors in the X, Y, and Z directions are 12 mm, 12 mm, and 10 mm, respectively, with average errors of 6.6 mm, 7.1 mm, and 7.1 mm, all within acceptable limits for robotic harvesting. This identification and localization system was deployed on a self-developed pear-picking actuator, and field tests were conducted outdoors with a picking platform. The success rate of fruit picking was approximately 90.2%, with an average continuous picking speed of about 5 seconds per fruit. This study effectively addresses technical challenges in visual perception for fruit harvesting robots and is suitable for deployment on resource-constrained devices. It provides critical recognition and localization support for harvesting robots targeting fruits such as pears.
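The coordinate transformation pipeline described above (2D pixel plus metric depth → 3D camera coordinates → robot-base coordinates) can be sketched with the standard pinhole back-projection followed by a rigid transform. The intrinsics, rotation, and translation below are illustrative placeholders, not the paper's calibration values, which would come from Zhang's-method calibration of the D455i and hand-eye calibration of the actuator:

```python
import numpy as np

# Hypothetical intrinsics for illustration; real values come from
# calibrating the binocular camera (e.g. with Zhang's method).
FX, FY = 600.0, 600.0   # focal lengths in pixels (placeholder)
CX, CY = 320.0, 240.0   # principal point in pixels (placeholder)

def pixel_to_camera_xyz(u, v, depth, fx=FX, fy=FY, cx=CX, cy=CY):
    """Pinhole back-projection: pixel (u, v) plus metric depth from the
    stereo camera -> 3D point (X, Y, Z) in the camera frame, in meters."""
    X = (u - cx) * depth / fx
    Y = (v - cy) * depth / fy
    return (X, Y, depth)

def camera_to_base(p_cam, R, t):
    """Rigid transform from the camera frame to the robot-base frame.
    R (3x3) and t (3,) would come from hand-eye calibration."""
    return R @ np.asarray(p_cam) + np.asarray(t)
```

A point detected at the principal point, for example, back-projects to (0, 0, depth) in the camera frame; the end effector then targets the transformed base-frame coordinates.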


