
Fast Recognition Method for Multiple Apple Targets in Dense Scenes Based on CenterNet

Abstract: To improve the recognition efficiency and environmental adaptability of apple-picking robots, so that they can quickly and accurately recognize multiple apple targets in dense scenes, a fast recognition method for multiple apple targets in dense scenes was proposed. Following the idea that "the point is the target", the method recognizes an apple by predicting its center point together with its width and height. The CenterNet network was improved by designing a lightweight Tiny Hourglass-24 backbone and optimizing the residual module, which increased recognition speed. Experimental results showed that the method achieved an average precision (AP) of 98.90% and an F1 score of 96.39% on the non-dense (close-range) test set, and an AP of 93.63% and an F1 score of 92.91% on the dense (long-range) test set, with an average recognition time of 0.069 s per image. Compared with YOLO v3 and CornerNet-Lite on the two types of test sets, the proposed method improved AP on the dense-scene test set by 4.13 and 29.03 percentage points, respectively, and reduced the average recognition time per image by 0.04 s relative to YOLO v3 and by 0.646 s relative to CornerNet-Lite. The method requires neither anchor boxes nor non-maximum suppression post-processing, and can provide technical support for apple-picking robots to quickly and accurately recognize multiple apple targets in dense scenes.
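As a rough illustration of the anchor-free, NMS-free decoding described in the abstract, the sketch below shows how CenterNet-style outputs (a per-class center-point heatmap plus a width/height map) can be turned into boxes in PyTorch: local maxima are kept with a 3x3 max-pool instead of non-maximum suppression, and the box size is read directly at each peak. The function name, tensor shapes, and thresholds are illustrative assumptions, not the paper's implementation.

import torch
import torch.nn.functional as F

def decode_centers(heatmap, wh, k=100, score_thresh=0.3):
    """Decode boxes from CenterNet-style outputs (illustrative sketch).

    heatmap: (1, C, H, W) per-class center-point heatmap, assumed post-sigmoid.
    wh:      (1, 2, H, W) predicted box width/height at each location.
    Returns (x1, y1, x2, y2, score, class_id) tuples in feature-map coordinates;
    multiply by the output stride to map them back to image coordinates.
    """
    # Keep only local maxima: a 3x3 max-pool stands in for NMS post-processing.
    peaks = F.max_pool2d(heatmap, kernel_size=3, stride=1, padding=1)
    heatmap = heatmap * (peaks == heatmap).float()

    n, c, h, w = heatmap.shape
    scores, flat_idx = heatmap.view(n, -1).topk(k)   # top-k peaks over classes*H*W
    class_ids = flat_idx // (h * w)                  # which class map the peak came from
    pix = flat_idx % (h * w)
    ys, xs = pix // w, pix % w                       # peak position = predicted center point

    boxes = []
    for score, cls, y, x in zip(scores[0], class_ids[0], ys[0], xs[0]):
        if score < score_thresh:
            continue
        bw, bh = wh[0, 0, y, x], wh[0, 1, y, x]      # width/height regressed at the center
        boxes.append((float(x - bw / 2), float(y - bh / 2),
                      float(x + bw / 2), float(y + bh / 2),
                      float(score), int(cls)))
    return boxes

Because every detection is read off a single peak, the pipeline needs no anchor boxes and no separate NMS stage, which is what keeps per-image inference time low.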

     
