Abstract:
To address the low efficiency of traditional manual detection of strawberry fruit and leaf diseases and its susceptibility to environmental constraints, this study proposes YOLOv8n-SFLD, an intelligent detection model built on an improved YOLOv8n framework with multi-scale collaborative attention mechanisms. A high-quality dataset of 8,745 images covering seven common strawberry disease phenotypes (angular leaf spot, anthracnose, blossom blight, gray mold, leaf spot, and powdery mildew on leaves and on fruits) was constructed from the Kaggle platform and field collections, then augmented with geometric transformations, illumination adjustments, noise addition, occlusion simulation, and Mosaic enhancement to improve generalization. Four modifications were applied to the baseline YOLOv8n. First, the Universal Perception Large-Kernel ConvNet (UniRepLKNet) module was integrated into the C2f structure to enhance deep semantic feature extraction. Second, Channel Prior Convolutional Attention (CPCA) was introduced into the backbone to improve lesion detection accuracy against complex backgrounds. Third, Multi-Scale Collaborative Attention (MSCA) with parallel 3×3, 5×5, and 7×7 convolutional branches was incorporated into the neck to capture multi-scale lesion features. Fourth, the Wise-IoU loss function replaced CIoU to improve bounding box regression for small targets through dynamic non-monotonic focusing. Experimental results show that YOLOv8n-SFLD outperforms both the baseline and the compared detectors. In the attention mechanism comparison, embedding CPCA in the backbone yields the best precision (88.6%), while MSCA in the neck achieves the highest mAP50 (87.1%); combining the two further improves precision and mAP50 by 3.8 and 2.1 percentage points (pp), respectively. 
Per-category analysis reveals significant performance variation: blossom blight, healthy fruit, angular leaf spot, and leaf spot achieve mAP50 of 98.9%, 99.5%, 90.1%, and 95.0%, respectively, with miss rates below 12.5%, whereas anthracnose and healthy leaf show lower mAP50 (62.8% and 58.8%) and higher miss rates (44.0% and 37.2%), which is attributed to sample imbalance. Ablation experiments confirm the synergistic effect of all modules: the complete YOLOv8n-SFLD (C2f-URBlock + CPCA + MSCA + Wise-IoU) attains a precision of 91.9%, mAP50 of 87.0%, and mAP50-95 of 67.5%, improvements of 7.4, 2.3, and 1.5 pp over the baseline, with a model weight of 13.4 MB and a computational cost of 16.0 GFLOPs. In comparative experiments with RT-DETR, YOLOv8n, YOLOv8s, YOLOv9t, YOLOv10n, YOLOv11, and YOLOv12, YOLOv8n-SFLD outperforms all of them in precision (5.2–10.6 pp higher) and mAP50 (2.0–5.8 pp higher). Visualization analysis supports these results. In gray mold detection with multi-disease co-infection, the model detects all lesions (confidence 0.94), while the other models falsely detect dried leaves or miss fruit powdery mildew. In leaf spot images with dense small lesions, it achieves complete coverage (confidence 0.90–0.94) with tightly fitting boxes. In powdery mildew images with low-contrast backgrounds, it maintains high confidence (>0.94) and accurate localization. These advantages stem from the combined effect of the four modifications: C2f-URBlock enhances deep feature extraction, CPCA strengthens lesion discrimination against complex backgrounds, MSCA adapts to multi-scale variation, and Wise-IoU improves small-target regression. YOLOv8n-SFLD thus addresses the challenges of multi-scale lesion variation, low-contrast features, and complex background interference, and achieves a favorable balance between accuracy and efficiency for deployment on resource-constrained edge devices. 
The model provides reliable technical support for precision pesticide application and smart agriculture monitoring, and offers a reference for disease detection in other crops. Future work will focus on expanding the dataset, compressing the model for mobile deployment, and integrating it with UAVs for real-time monitoring.
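
As an illustration of the dynamic non-monotonic focusing mentioned for the Wise-IoU loss, the following is a minimal sketch for a single predicted/ground-truth box pair. It assumes the v3 formulation of Wise-IoU; the abstract does not state which variant or hyperparameters were used. The function names, the (x1, y1, x2, y2) box format, and the defaults alpha=1.9 and delta=3.0 are assumptions for illustration, and mean_iou_loss stands in for the running mean of the IoU loss that a training loop would maintain.

```python
import math


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Non-monotonic gain r = beta / (delta * alpha ** (beta - delta)).

    beta is the "outlier degree" of a box: its IoU loss divided by the
    running mean IoU loss. alpha and delta are assumed hyperparameters.
    """
    return beta / (delta * alpha ** (beta - delta))


def wise_iou_loss(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """Sketch of a Wise-IoU (v3-style) loss for one box pair.

    Boxes are (x1, y1, x2, y2) with positive area (assumed format).
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Plain IoU loss.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou_loss = 1.0 - inter / union

    # Distance attention: squared center distance normalized by the
    # squared size of the smallest box enclosing both boxes.
    dx = (px1 + px2) / 2 - (tx1 + tx2) / 2
    dy = (py1 + py2) / 2 - (ty1 + ty2) / 2
    enc_w = max(px2, tx2) - min(px1, tx1)
    enc_h = max(py2, ty2) - min(py1, ty1)
    distance_term = math.exp((dx * dx + dy * dy) / (enc_w ** 2 + enc_h ** 2))

    # Dynamic focusing: scale the loss by a gain that depends on how this
    # box compares with the average box quality seen so far.
    beta = iou_loss / mean_iou_loss
    return focusing_gain(beta, alpha, delta) * distance_term * iou_loss


if __name__ == "__main__":
    # A slightly shifted prediction against its ground-truth box.
    print(wise_iou_loss((0, 0, 2, 2), (0.5, 0.5, 2.5, 2.5), mean_iou_loss=0.5))
```

With these assumed defaults, the gain is largest at a moderate outlier degree and falls off for both very well-fitted and very poorly fitted boxes, which is the sense in which the focusing is non-monotonic.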