Abstract:
Precise and rapid detection of grape fruit and leaf diseases is vital to the stable development of the grape industry. Grape growth is governed by complex environmental factors such as climate and soil, and the diseases are numerous, with complex and overlapping symptoms. Traditional detection methods cannot reliably distinguish between diseases because of frequent false and missed detections, which in turn lowers grape yield and quality. In this study, RLSL-YOLO11n, an improved model based on YOLO11n, was proposed to detect grape fruit and leaf diseases. Four common and highly damaging conditions were targeted: gray mold (Botrytis cinerea), downy mildew, powdery mildew, and fruit cracking. Building on the efficient detection of the YOLO series, RLSL-YOLO11n achieved better performance through a series of improvements. Firstly, the RepConv convolution module was introduced into the backbone, and a multi-scale fusion backbone network, RHGNetv2, was built on the HGNetv2 architecture. This improved the model's ability to capture disease features at different scales. The optimized network structure perceives disease features more comprehensively while reducing the number of parameters and the computational complexity. Secondly, the SPPF (Spatial Pyramid Pooling - Fast) module was combined with the LSKA (Large Separable Kernel Attention) mechanism to form the LSPF module and further enhance feature extraction. By introducing large separable kernel convolution and attention, the LSPF module focuses feature extraction on the diseased areas.
This also helps the model recognize and distinguish disease features against complex backgrounds, effectively reducing background interference. Furthermore, the Slim Neck architecture was adopted to optimize the neck feature fusion network of YOLO11n. The feature fusion path was simplified to reduce redundant computation, which further lowered the computational complexity and improved operational efficiency while maintaining high recognition accuracy. Finally, a lightweight shared-convolution, separated-batch-normalization detection head (LSCSBND) was designed to make the model even lighter. The detection head effectively reduces the number of parameters and the computational complexity, while its shared convolution kernels and batch normalization improve the localization and extraction of multi-scale disease features. A dataset containing 2,449 original images of the four grape fruit and leaf diseases was constructed to verify the performance of RLSL-YOLO11n. The model achieved a precision, recall, mAP0.5, and mAP0.5-0.95 of 82.8%, 76.1%, 83.1%, and 52.0%, respectively. Compared with the baseline YOLO11n, mAP0.5 and mAP0.5-0.95 improved by 1.9 and 6.0 percentage points, respectively, while the number of parameters, the computational complexity, and the model weight size were reduced by 12.0%, 14.2%, and 8.9%, respectively. These findings provide a new solution for the precise detection of grape fruit and leaf diseases and offer strong support for the lightweight deployment and practical application of disease detection in modern agriculture.
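
The reported gains can be cross-checked with simple arithmetic. The following Python sketch derives the baseline YOLO11n scores implied by the stated percentage-point improvements and converts them into relative improvements. The baseline values are inferred from the figures above rather than quoted from the study, and the names `implied_baseline` and `relative_gain` are illustrative only.

```python
# Scores reported in the abstract for the improved model (percent).
REPORTED = {"mAP0.5": 83.1, "mAP0.5-0.95": 52.0}
# Stated gains over the YOLO11n baseline, in percentage points.
GAIN_PP = {"mAP0.5": 1.9, "mAP0.5-0.95": 6.0}


def implied_baseline(improved: float, gain_pp: float) -> float:
    """Baseline score implied by an absolute gain in percentage points."""
    return round(improved - gain_pp, 1)


def relative_gain(improved: float, gain_pp: float) -> float:
    """Relative improvement, as a percent of the implied baseline score."""
    return 100.0 * gain_pp / (improved - gain_pp)


if __name__ == "__main__":
    for name, score in REPORTED.items():
        base = implied_baseline(score, GAIN_PP[name])
        rel = relative_gain(score, GAIN_PP[name])
        print(f"{name}: implied baseline ~{base:.1f}%, relative gain ~{rel:.1f}%")
```

Because a percentage-point gain is an absolute difference, the 6.0-point gain in mAP0.5-0.95 corresponds to a considerably larger relative improvement than the 1.9-point gain in mAP0.5.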