Abstract:
Pruning is one of the most critical operations in fruit tree cultivation. Recent pruning robots can recognize side branches and locate pruning points, but effective end-effector pose estimation for intelligent selective pruning is still lacking. This study proposes pruning point localization and end-effector pose estimation from RGB-D images, taking dormant high-spindle apple trees as the research object. An Intel RealSense D435i depth camera was used to capture RGB and depth data, and a point-to-plane mapping was introduced to derive the 3D position and orientation of the pruning pose from the detected pixel coordinates and depth information, predicting the orientation of the cutting plane relative to the pruning point, a key requirement for autonomous robotic pruning. In the perception pipeline, an improved YOLOv8-seg model segmented the trunk and primary branch regions from the RGB images. Because the unconventional annotation left the branch-base masks without clear boundary features, the original YOLOv8-seg model failed to accurately locate and segment these regions. A Global Attention Mechanism (GAM) module was therefore introduced into the neck of YOLOv8-seg and integrated with each C2f block across all feature levels; the feature maps were recalibrated by channel-wise multiplication to enhance salient features while suppressing irrelevant ones, strengthening multi-scale representation and improving segmentation accuracy. The improved YOLOv8-seg achieved a mask-level precision of 95.31%, recall of 93.79%, and mAP@0.5 of 93.86%, outperforming the original YOLOv8-seg by 0.79, 2.63, and 1.47 percentage points, respectively. Once the trunk and primary branches were segmented, OpenCV-based image processing computed branch diameters and spacing, and potential pruning points were identified by fitting rectangles around the base regions of the side branches according to empirical pruning rules. Field trials validated the effectiveness of this approach: pruning point decisions reached an accuracy of 88.3% at an average processing speed of 2.1 s per image, and the point-to-plane pose estimation achieved a success rate of 89.9% at an average of 3.3 s per image, giving a total of 5.4 s per image from image acquisition to pose estimation. In conclusion, this work presents a framework that integrates deep learning with image processing for the intelligent selective pruning of apple trees from RGB-D input, realizing accurate pruning point localization and end-effector pose estimation. The point-to-plane mapping determines the spatial location of each pruning point, and the normal vector of the cutting plane is derived from the detected pruning point and the surrounding branch structure so that the cutting orientation meets horticultural requirements for tree health; the manipulator's reachability and safety distances are considered when generating feasible pruning poses for practical execution. The estimated pruning end-effector poses provide strong support for the development of robotic pruning.
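The abstract does not give the exact form of the point-to-plane mapping; the first step, recovering a 3D camera-frame point from a detected pixel and its depth value, can be sketched with the standard pinhole deprojection model. The intrinsics below (fx, fy, cx, cy) are illustrative placeholders, not values reported for the D435i in this work:

```python
import numpy as np

def deproject(u, v, depth, fx, fy, cx, cy):
    """Map a pixel (u, v) with metric depth (m) to a 3D point in the
    camera frame using the pinhole model: X = (u - cx) * Z / fx, etc."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

# Illustrative 640x480 intrinsics (hypothetical, not from the paper):
# a pixel at the principal point deprojects straight along the optical axis.
p = deproject(320, 240, 1.0, fx=615.0, fy=615.0, cx=320.0, cy=240.0)
```

In practice the RealSense SDK provides an equivalent deprojection using the calibrated intrinsics streamed by the camera.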
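The channel-wise recalibration described for the GAM-enhanced neck can be illustrated with a minimal gating sketch: a global channel descriptor is passed through a small MLP and the resulting sigmoid gates multiply each channel of the feature map. This is a simplified SE-style stand-in, not the full GAM (which also applies spatial attention), and all weights here are hypothetical:

```python
import numpy as np

def channel_recalibrate(feat, w1, b1, w2, b2):
    """Recalibrate a (C, H, W) feature map by channel-wise multiplication.
    Gates in (0, 1) come from a two-layer MLP over the globally averaged
    channel descriptor, amplifying salient channels and damping others."""
    desc = feat.mean(axis=(1, 2))                        # (C,) global context
    hidden = np.maximum(0.0, w1 @ desc + b1)             # ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))    # sigmoid, shape (C,)
    return feat * gates[:, None, None]                   # channel-wise product

# Toy example with C=4 channels and random (illustrative) weights.
rng = np.random.default_rng(0)
feat = np.ones((4, 2, 2))
w1, b1 = rng.normal(size=(2, 4)), np.zeros(2)
w2, b2 = rng.normal(size=(4, 2)), np.zeros(4)
out = channel_recalibrate(feat, w1, b1, w2, b2)   # same (4, 2, 2) shape
```

Because the gates lie strictly in (0, 1), the operation rescales rather than redistributes features, which is why it can suppress irrelevant responses without altering the feature-map resolution.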
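The abstract states that the cutting plane's normal vector is derived from the detected pruning point and surrounding branch structure, without specifying the method. One common way to obtain such a normal from a local neighborhood of 3D branch points is a least-squares plane fit via SVD; the sketch below assumes an (N, 3) array of points already recovered from the depth map:

```python
import numpy as np

def plane_normal(points):
    """Unit normal of the least-squares plane through an (N, 3) point set.
    After centering, the right singular vector with the smallest singular
    value is the direction of least variance, i.e. the plane normal."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    n = vt[-1]
    return n / np.linalg.norm(n)

# Points sampled on the z = 0 plane yield a normal along +/- z.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0], [1.0, 1.0, 0.0],
                [0.5, 0.3, 0.0]])
n = plane_normal(pts)
```

The sign of the normal is ambiguous from the fit alone; a real pipeline would orient it consistently, for example toward the camera or perpendicular to the branch axis, before generating the end-effector pose.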