Abstract:
To automate pitaya harvesting, an improved U-Net based method for pitaya image segmentation and pose estimation was proposed. First, a concurrent spatial and channel squeeze and excitation (SCSE) module was introduced into the skip connections of the U-Net model (the concatenation between encoder and decoder feature maps). The SCSE module was also integrated into the double residual block (DRB) to enhance the network's ability to extract effective features and to improve its convergence speed, yielding a pitaya image segmentation network based on an attention residual U-Net. Using the mask images of fruits and their accompanying branches segmented by this network, image processing techniques and a camera imaging model were applied to fit the contours, centroids, minimum bounding rectangles, and three-dimensional bounding boxes of the fruits and branches. The three-dimensional pose of each pitaya was then estimated from the positional relationship between the fruit and its accompanying branch. A test set collected in pitaya plantations was used to evaluate the algorithm, and field picking experiments were conducted in a natural orchard environment. The experimental results showed that the mean intersection over union (mIoU) and mean pixel accuracy (mPA) of pitaya fruit segmentation reached 86.69% and 93.89%, respectively, and the average error of three-dimensional pose estimation was 8.8°. The success rate of the pitaya picking robot in the orchard environment was 86.7%, with an average picking time of 22.3 s. These results indicate that the proposed method can provide technical support for developing an intelligent pitaya picking robot to achieve automated and precise picking.
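To make the attention mechanism concrete, the following is a minimal PyTorch sketch of a concurrent spatial and channel squeeze and excitation block of the kind the abstract describes; the class names, reduction ratio, and layer choices are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical SCSE block: channel SE and spatial SE applied in parallel,
# then combined. Names and the reduction ratio are assumptions for illustration.
import torch
import torch.nn as nn

class ChannelSE(nn.Module):
    """Channel squeeze and excitation: reweights channels via global pooling."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w

class SpatialSE(nn.Module):
    """Spatial squeeze and excitation: reweights pixel locations via a 1x1 conv."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, 1, kernel_size=1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        return x * self.sigmoid(self.conv(x))

class SCSE(nn.Module):
    """Concurrent spatial and channel SE: sum of the two recalibrated feature maps."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.cse = ChannelSE(channels, reduction)
        self.sse = SpatialSE(channels)

    def forward(self, x):
        return self.cse(x) + self.sse(x)
```

In an attention residual U-Net of this kind, such a block would typically be applied to the skip-connection feature maps before concatenation and inside each double residual block, as the abstract outlines.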
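The post-segmentation geometry step can likewise be sketched. The code below assumes binary masks for the fruit and its accompanying branch, an aligned depth value at each centroid, and pinhole intrinsics (fx, fy, cx, cy); the function names, angle conventions, and pose definition are illustrative assumptions rather than the paper's exact procedure.

```python
# Hypothetical sketch: contour/centroid/min-area-rectangle fitting with OpenCV,
# pinhole back-projection to 3-D, and a pose from the branch-to-fruit axis.
import cv2
import numpy as np

def centroid_and_box(mask):
    """Largest contour of a binary mask, its centroid (pixels), and its minimum-area rectangle."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea)
    m = cv2.moments(contour)
    u, v = m["m10"] / m["m00"], m["m01"] / m["m00"]
    box = cv2.minAreaRect(contour)          # (center, (w, h), angle)
    return (u, v), box

def back_project(u, v, depth, fx, fy, cx, cy):
    """Pinhole camera model: pixel (u, v) at range `depth` -> 3-D camera coordinates."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

def fruit_pose(fruit_xyz, branch_xyz):
    """Pose angles (degrees) of the branch-to-fruit growth axis; conventions assumed."""
    d = fruit_xyz - branch_xyz
    d = d / np.linalg.norm(d)
    pitch = np.degrees(np.arcsin(-d[1]))     # tilt relative to the horizontal plane
    yaw = np.degrees(np.arctan2(d[0], d[2])) # heading about the vertical axis
    return pitch, yaw
```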