Abstract:
Surface defects seriously degrade the grading efficiency and market value of apples, so accurate and rapid post-harvest quality detection is in urgent demand. However, conventional single-view imaging captures only part of the fruit surface, owing to the irregular spherical shape of apples, and defects in the unimaged areas are therefore often missed. Eliminating these large blind areas is required for high-precision quality inspection in the intelligent upgrading of the apple industry. In this study, a three-view imaging method was proposed to detect apple surface defects, in which the high redundancy inherent in multi-view imaging was also reduced, so that defect features could be detected under complex backgrounds. Reliable detection was realized through three key technical procedures.

Firstly, an image dataset of apple surface defects was collected. Three Intel RealSense D415 depth cameras were deployed to accurately cover the imaging area, and the hardware synchronization function of the cameras was used to acquire data simultaneously in real time. After acquisition and intrinsic parameter calibration, the RGB and depth images were converted into three-dimensional point clouds. A series of point cloud processing steps was then performed, including preprocessing, coarse registration, fine registration, downsampling, and surface reconstruction. The 3D reconstruction of the apples enabled the joint imaging area of the three-view system to be calculated. Secondly, a region segmentation method suitable for three-view apple images was proposed using a standard sphere model, with which the redundant background and overlapping regions were removed from the three-view images. Thirdly, the basic You Only Look Once version 11 (YOLOv11) model was improved to enhance detection performance. Specifically, the C3k module in the Neck of the original model was replaced with a Non-local Attention Residual Multi-Layer Perceptron (NARM) module, and the resulting NARM-YOLOv11 model captured long-range feature dependencies to identify small-scale defects. A series of experiments was carried out to verify the effectiveness of the improved system.

The results showed that the three-view imaging fused the multi-angle surface information of the apples after precise point cloud registration and reconstruction. The average proportion of the imaged apple surface increased from 34.6% with single-view imaging to 74.3%, significantly reducing the detection blind area and covering most of the apple surface. In the image segmentation with the standard sphere model, the redundant regions were effectively removed from the three-view images, with an average redundant-region removal rate of 20.5%. Furthermore, the average defect detection repetition rate caused by overlapping imaging areas was reduced from 26.0% in the original images to 7.6% after segmentation, while the average missed detection rate was kept to 3.6%. The high redundancy of multi-view imaging was thus avoided, supporting high accuracy in the subsequent defect identification.
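The conversion from calibrated RGB-D frames to point clouds follows the standard pinhole back-projection, X = (u - cx)z/fx, Y = (v - cy)z/fy, Z = z. A minimal Python sketch is given below; the image size and intrinsic values are placeholders rather than the calibrated D415 parameters.

import numpy as np

def depth_to_points(depth, rgb, fx, fy, cx, cy):
    # Back-project each pixel (u, v) with depth z to
    # X = (u - cx) * z / fx, Y = (v - cy) * z / fy, Z = z.
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0                                 # drop pixels without depth
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1)[valid]  # (N, 3) coordinates
    colors = rgb[valid] / 255.0                       # (N, 3) RGB in [0, 1]
    return points, colors

# Synthetic example at an assumed 1280 x 720 resolution; the intrinsics
# below are placeholders for the calibrated D415 values.
depth = np.full((720, 1280), 0.4)                     # a flat plane 0.4 m away
rgb = np.zeros((720, 1280, 3), dtype=np.uint8)
pts, cols = depth_to_points(depth, rgb, fx=900.0, fy=900.0, cx=640.0, cy=360.0)
print(pts.shape)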
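A minimal sketch of the point cloud pipeline is shown below, assuming the Open3D library: FPFH feature matching with RANSAC stands in for the coarse registration and point-to-plane ICP for the fine registration, which is one common realization of the named steps rather than the exact implementation of this study. The file names, voxel size, and distance thresholds are illustrative.

import open3d as o3d

VOXEL = 0.002  # assumed 2 mm downsampling voxel

def preprocess(pcd):
    # Downsample, denoise, and compute normals plus FPFH features.
    down = pcd.voxel_down_sample(VOXEL)
    down, _ = down.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
    down.estimate_normals(
        o3d.geometry.KDTreeSearchParamHybrid(radius=VOXEL * 5, max_nn=30))
    fpfh = o3d.pipelines.registration.compute_fpfh_feature(
        down,
        o3d.geometry.KDTreeSearchParamHybrid(radius=VOXEL * 10, max_nn=100))
    return down, fpfh

def register(source, target):
    # Coarse alignment by FPFH feature matching with RANSAC,
    # refined by point-to-plane ICP.
    src, src_f = preprocess(source)
    tgt, tgt_f = preprocess(target)
    reg = o3d.pipelines.registration
    coarse = reg.registration_ransac_based_on_feature_matching(
        src, tgt, src_f, tgt_f, True, VOXEL * 3,
        reg.TransformationEstimationPointToPoint(False), 3, [],
        reg.RANSACConvergenceCriteria(100000, 0.999))
    fine = reg.registration_icp(
        src, tgt, VOXEL * 1.5, coarse.transformation,
        reg.TransformationEstimationPointToPlane())
    return fine.transformation

# Fuse the three synchronized views into one apple point cloud
# (the view_*.ply file names are hypothetical).
views = [o3d.io.read_point_cloud(f"view_{i}.ply") for i in range(3)]
merged = views[0]
for v in views[1:]:
    v.transform(register(v, merged))
    merged += v
merged = merged.voxel_down_sample(VOXEL)

# Poisson surface reconstruction, from which the imaging area follows.
merged.estimate_normals(
    o3d.geometry.KDTreeSearchParamHybrid(radius=VOXEL * 5, max_nn=30))
mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(merged, depth=8)
print("reconstructed surface area:", mesh.get_surface_area())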
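The internal details of the standard sphere model segmentation are not given here; one plausible reading is sketched below, in which each surface point is assigned to the camera that views it most frontally, so that the three views partition the sphere without overlap and each defect is counted once. The 120 degree camera layout and the ownership rule are assumptions for illustration.

import numpy as np

def assign_views(points, center, cam_dirs):
    # Each sphere-surface point has outward normal (p - center), normalized;
    # it is owned by the camera whose viewing direction best matches it.
    normals = points - center
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    scores = normals @ np.asarray(cam_dirs).T  # (N, n_views) alignment
    return scores.argmax(axis=1)               # owning view per point

# Assumed layout: three cameras 120 deg apart in the horizontal plane,
# each unit vector pointing from the sphere center toward a camera.
cam_dirs = [(np.cos(a), np.sin(a), 0.0) for a in np.deg2rad([0.0, 120.0, 240.0])]

# Example on a unit sphere: a defect detected in view i is kept only
# where owner == i, so overlapping regions are not double-counted.
rng = np.random.default_rng(0)
pts = rng.normal(size=(1000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
owner = assign_views(pts, np.zeros(3), cam_dirs)
print(np.bincount(owner, minlength=3))  # points owned by each view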
In the model tests, the precision, recall, and mean average precision (mAP) of NARM-YOLOv11 increased by 2.7, 2.5, and 3.4 percentage points, respectively, compared with the basic YOLOv11, showing that the NARM module enhanced feature extraction, especially for the small-scale and low-contrast surface defects of apples. Although the model complexity increased slightly with the added attention mechanism and multi-layer perceptron structure, the frame rate decreased by only 1.7 frames per second, fully meeting the requirements of real-time detection in practical applications. For the integrated system, the average precision of apple surface defect detection reached 89.7%, and the average defect recognition rate was 88.1%, indicating high reliability and practicality. The three-view imaging, sphere-model image segmentation, and improved NARM-YOLOv11 model were combined to detect defect features under complex backgrounds, avoiding both the large blind area of single-view imaging and the high redundancy of raw three-view imaging. This full-surface defect detection of spherical fruits can provide a feasible technical scheme for the intelligent upgrading of post-harvest quality inspection in the apple industry, and offers solid support for combining multi-view imaging with deep learning in modern agriculture.
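The internal structure of the NARM module is not specified in this abstract; the following PyTorch sketch combines a non-local self-attention step with a residual multi-layer perceptron, as the module name suggests, and is an assumed layout rather than the published design. The channel width and feature map size are illustrative.

import torch
import torch.nn as nn

class NARM(nn.Module):
    def __init__(self, channels, mlp_ratio=2):
        super().__init__()
        # Non-local (self-attention) branch with halved embedding width.
        self.theta = nn.Conv2d(channels, channels // 2, 1)  # query
        self.phi = nn.Conv2d(channels, channels // 2, 1)    # key
        self.g = nn.Conv2d(channels, channels // 2, 1)      # value
        self.out = nn.Conv2d(channels // 2, channels, 1)
        # Residual MLP implemented with 1x1 convolutions.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels * mlp_ratio, 1), nn.GELU(),
            nn.Conv2d(channels * mlp_ratio, channels, 1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)   # (B, HW, C/2)
        k = self.phi(x).flatten(2)                     # (B, C/2, HW)
        v = self.g(x).flatten(2).transpose(1, 2)       # (B, HW, C/2)
        attn = torch.softmax(q @ k / (c // 2) ** 0.5, dim=-1)  # pairwise weights
        y = (attn @ v).transpose(1, 2).reshape(b, c // 2, h, w)
        x = x + self.out(y)      # long-range dependencies, residual add
        return x + self.mlp(x)   # residual MLP refinement

# Example on a neck-scale feature map (assumed 256 channels, 20 x 20).
feat = torch.randn(1, 256, 20, 20)
print(NARM(256)(feat).shape)  # torch.Size([1, 256, 20, 20])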