Abstract:
Accurate and timely detection of citrus leaf diseases in natural outdoor environments is critical for effective orchard management and sustainable agricultural production. However, this task faces significant challenges due to the small size of disease lesions, complex backgrounds with overlapping leaves and branches, and highly variable illumination caused by weather, shadows, and sun exposure. To address these issues, this study proposes an improved YOLOv5-based algorithm designed for robust citrus leaf disease detection under real-world conditions. The proposed method introduces three key enhancements to the original YOLOv5 architecture: a high-resolution detection head for small targets, an advanced attention mechanism for feature refinement, and an optimized loss function for precise localization.

To improve the detection of small disease spots, which are often only a few pixels in size, a new high-resolution small-target detection head, named H0, is incorporated into the network. This head is connected to shallow layers of the backbone that preserve high spatial resolution, enabling the model to detect minute pathological features that standard detection heads typically miss. By fusing features from the deep neck network with those from the shallow backbone layers, the model achieves an enhanced multi-scale feature representation. This cross-level fusion strengthens the model's ability to recognize small lesions while maintaining contextual awareness, significantly improving detection sensitivity for early-stage diseases.
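As a rough illustration only (the exact layer indices, channel widths, and anchor/class counts are not specified here, so the values below are assumptions), an extra high-resolution head of this kind can be sketched in PyTorch as follows:

```python
# Minimal sketch, not the authors' exact configuration: an extra high-resolution
# detection scale that fuses an upsampled neck feature map with a shallow,
# high-resolution backbone feature map. Channel sizes and output layout are assumed
# (3 anchors x (4 box + 1 objectness + 3 classes) channels as an example).
import torch
import torch.nn as nn

class H0Head(nn.Module):
    def __init__(self, backbone_ch=64, neck_ch=128, num_outputs=3 * (5 + 3)):
        super().__init__()
        # Reduce the neck feature, then upsample it to the shallow feature's resolution.
        self.reduce = nn.Conv2d(neck_ch, backbone_ch, kernel_size=1)
        self.up = nn.Upsample(scale_factor=2, mode="nearest")
        # Fuse the concatenated shallow and upsampled features before prediction.
        self.fuse = nn.Sequential(
            nn.Conv2d(backbone_ch * 2, backbone_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(backbone_ch),
            nn.SiLU(),
        )
        self.pred = nn.Conv2d(backbone_ch, num_outputs, kernel_size=1)

    def forward(self, shallow_feat, neck_feat):
        # shallow_feat: high-resolution backbone map, e.g. (B, 64, 160, 160)
        # neck_feat:    deeper neck map,              e.g. (B, 128, 80, 80)
        x = self.up(self.reduce(neck_feat))
        x = torch.cat([shallow_feat, x], dim=1)
        return self.pred(self.fuse(x))
```

In this sketch the fused map keeps the shallow layer's resolution (four times finer than the usual coarsest-to-finest YOLOv5 scales at the same input size), which is what lets the new head separate few-pixel lesions from the background.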
To enhance the model's discriminative power in complex scenes, an improved attention module called MR-CBAM (multi-scale fusion residual structure convolutional block attention module) is introduced in the feature extraction stage. Whereas the standard CBAM applies channel and spatial attention directly to the incoming features, MR-CBAM first passes the input through a multi-scale residual block that processes it along parallel convolutional paths with different kernel sizes. This allows the model to capture contextual information at several scales and to distinguish subtle disease patterns from background clutter such as leaf veins or soil. The fused multi-scale features are then refined by the CBAM structure, which recalibrates the feature maps by emphasizing informative channels and spatial regions, while the residual connection ensures stable gradient propagation, aiding convergence and preserving fine detail. This design markedly improves the model's robustness under challenging lighting conditions such as overexposure or low light.
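A minimal sketch of this idea is given below; the branch kernel sizes, reduction ratio, and exact ordering are illustrative assumptions rather than the module's published specification:

```python
# Minimal sketch of the described MR-CBAM idea (structure assumed): parallel
# convolutions with different kernel sizes, 1x1 fusion, CBAM-style channel and
# spatial attention, and a residual connection back to the input.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MRCBAM(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # Multi-scale residual block: parallel paths with different receptive fields.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in (3, 5, 7)
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)
        # Channel attention: shared MLP over average- and max-pooled descriptors.
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        # Spatial attention over channel-wise average and max maps.
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        # Multi-scale fusion of the parallel branches.
        feat = self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
        # Channel attention.
        avg = self.mlp(F.adaptive_avg_pool2d(feat, 1))
        mx = self.mlp(F.adaptive_max_pool2d(feat, 1))
        feat = feat * torch.sigmoid(avg + mx)
        # Spatial attention.
        sa = torch.cat([feat.mean(dim=1, keepdim=True),
                        feat.max(dim=1, keepdim=True).values], dim=1)
        feat = feat * torch.sigmoid(self.spatial(sa))
        # Residual connection preserves the original features and stabilizes gradients.
        return x + feat
```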
To achieve more accurate localization, especially for irregularly shaped lesions, the GIoU (generalized intersection over union) loss is adopted for bounding box regression. Beyond measuring overlap, GIoU penalizes the portion of the smallest enclosing box that is covered by neither the predicted nor the ground-truth box, so it provides meaningful gradients even when the two boxes do not intersect. This leads to faster convergence and more precise bounding box predictions, which is crucial for accurate disease localization in cluttered scenes.
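For reference, a minimal implementation of the GIoU loss for axis-aligned boxes in (x1, y1, x2, y2) format is sketched below; the training pipeline itself presumably relies on the loss routines already available in the YOLOv5 code base:

```python
# Minimal sketch of the GIoU loss for axis-aligned boxes (x1, y1, x2, y2).
import torch

def giou_loss(pred, target, eps=1e-7):
    # Intersection area.
    x1 = torch.max(pred[..., 0], target[..., 0])
    y1 = torch.max(pred[..., 1], target[..., 1])
    x2 = torch.min(pred[..., 2], target[..., 2])
    y2 = torch.min(pred[..., 3], target[..., 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)

    # Union area and IoU.
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    union = area_p + area_t - inter
    iou = inter / (union + eps)

    # Smallest enclosing box; GIoU subtracts its "empty" fraction from IoU.
    cx1 = torch.min(pred[..., 0], target[..., 0])
    cy1 = torch.min(pred[..., 1], target[..., 1])
    cx2 = torch.max(pred[..., 2], target[..., 2])
    cy2 = torch.max(pred[..., 3], target[..., 3])
    enclose = (cx2 - cx1) * (cy2 - cy1)
    giou = iou - (enclose - union) / (enclose + eps)
    return 1.0 - giou  # 0 when boxes coincide, approaching 2 when far apart
```

Because the loss approaches 2 rather than saturating at 1 for disjoint boxes, non-overlapping predictions still receive a useful training signal.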
To validate the proposed method, a comprehensive citrus leaf disease dataset covering citrus canker, greasy spot, and scab was collected under diverse natural conditions. Experimental results show that the improved model achieves an AP (average precision) of 91.5%, a recall of 90.2%, an mAP@0.5 of 89.8%, and an mAP@0.5:0.95 of 86.7%, gains of 2.1, 2.6, 1.6, and 1.4 percentage points, respectively, over the original YOLOv5. These results confirm the effectiveness of the proposed enhancements in boosting detection accuracy and reliability, offering a promising solution for practical citrus disease monitoring systems.