Detecting apple appearance quality in natural scenes using improved YOLOv8n
-
Graphical Abstract
-
Abstract
Here, the ZAL-YOLOv8n model was proposed to rapidly and accurately detect apple appearance quality in natural scenes by improving the YOLOv8n model. YOLOv8n, the lightest model in the YOLOv8 series, was selected as the baseline because it can be deployed on mobile edge devices, despite its relatively lower recognition accuracy. Apple appearance quality was classified into three categories, namely Immature, Mature, and Fall ill, according to influencing factors such as fruit maturity and health. A fruit with a greenish-blue peel was labeled "Immature"; a fruit whose peel was red, or a mix of red and yellow, was labeled "Mature"; and a fruit with disease spots on the peel was labeled "Fall ill". Two common apple diseases, "Ring rot" and "Scab", were further identified within the disease category. Firstly, partial convolution (PConv) and the efficient multi-scale attention mechanism (EMA) were integrated into an EP-C2f module to replace the C2f module in the backbone network. The EMA attention mechanism redistributed the attention weights and grouped all channels into groups with the same number of channels. Effective information in every channel was retained to realize the interaction between channel and spatial position, so that the model focused on the target region and feature extraction for apples under complex occlusion was enhanced. Meanwhile, PConv was used to filter the effective feature data, reducing the model size and the number of parameters after extraction. Secondly, MPDIoU was introduced as the bounding-box regression loss function to accurately locate the skin lesion areas of diseased apples, because the original bounding-box regression loss function of YOLOv8 (CIoU) could not fit quickly when the types and degrees of pathogen infection varied among diseased apples. MPDIoU minimizes the distances between the top-left and bottom-right corner points of the predicted and ground-truth boxes, accelerating the fitting of new positions so that lesion edges can be distinguished and located. Finally, the Slim-neck architecture was used to reconstruct the feature fusion network of the YOLOv8 model, lightening the neck network and increasing the operating speed. The experimental results indicate that the accuracy, recall, and mean average precision of the ZAL-YOLOv8n model increased by 3.4, 1.1, and 1.3 percentage points, respectively, while the floating-point operations, number of parameters, and model size were reduced by 22.2%, 17.7%, and 15.9%, respectively, fully meeting the deployment requirements of mobile edge devices. The ZAL-YOLOv8n model achieved highly precise detection and quality identification of apples with different appearance qualities in natural scenes, and its high degree of lightweighting balanced accuracy and speed, enabling real-time detection on low-computing-power devices. Therefore, the improved model can provide technical support for the research and development of intelligent apple-picking robots.
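As a worked illustration of the MPDIoU bounding-box regression loss described in the abstract, the sketch below computes the loss between predicted and ground-truth boxes: the IoU term is penalized by the squared distances between the top-left and bottom-right corner points, normalized by the input-image diagonal. This follows the published MPDIoU formulation; the function name, tensor layout, and variable names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of an MPDIoU loss, assuming boxes are given as (x1, y1, x2, y2)
# in image coordinates. Names here are hypothetical, not taken from the paper's code.
import torch

def mpdiou_loss(pred, target, img_w, img_h, eps=1e-7):
    """pred, target: (N, 4) tensors of boxes in (x1, y1, x2, y2) format."""
    # Intersection area of predicted and ground-truth boxes
    ix1 = torch.max(pred[:, 0], target[:, 0])
    iy1 = torch.max(pred[:, 1], target[:, 1])
    ix2 = torch.min(pred[:, 2], target[:, 2])
    iy2 = torch.min(pred[:, 3], target[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)

    # Union area and plain IoU
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    union = area_p + area_t - inter + eps
    iou = inter / union

    # Squared distances between top-left and bottom-right corner points
    d1 = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2
    d2 = (pred[:, 2] - target[:, 2]) ** 2 + (pred[:, 3] - target[:, 3]) ** 2
    diag = img_w ** 2 + img_h ** 2  # normalize by the squared image diagonal

    mpdiou = iou - d1 / diag - d2 / diag
    return 1.0 - mpdiou  # per-box loss; smaller corner distances give lower loss
```

Minimizing this loss simultaneously drives the two corner points of the predicted box toward those of the ground-truth box, which is the behavior the abstract relies on for rapid fitting of lesion positions.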
-