Abstract
Flax (oilseed flax) is one of the most important economic crops in northern China owing to its short growth cycle and broad adaptability. However, under global warming, frequent outbreaks of diseases, pests, and weeds have severely reduced flax yield and quality. Conventional manual detection is time-consuming, labor-intensive, and prone to subjective error. Computer vision and deep learning have advanced pest detection considerably, yet existing models still face challenges: low recognition accuracy for flax-specific biotic stresses, high computational complexity, and poor adaptability to resource-constrained scenarios. Moreover, most recent models address only diseases or only pests. In this study, a lightweight target detection model based on an improved YOLOv11n was proposed to identify flax diseases, pests, and weeds simultaneously. A dataset was constructed from field collection in two major flax-producing regions of Gansu Province, China. A total of 1 296 high-resolution (4 032×3 024) original images were captured using iPhone 14 Pro Max and iPad Pro devices, covering 8 categories: 5 pests (Meloidae, Shield bug, Colasposoma dauricum, Formicidae, and Helicoverpa armigera), 2 diseases (Verticillium dahliae and Yellow leaf disease), and 1 weed category (Chenopodiaceae-dominated weeds). After preprocessing to remove blurred or occluded images, offline augmentation (rotation, flipping, random cropping, brightness/contrast adjustment, and Gaussian blur) and online Mosaic augmentation were applied, expanding the dataset to 7 146 images, which were split into training (5 002), validation (1 429), and test (715) sets at a 7:2:1 ratio. Three improvements were made to the YOLOv11n model. First, the original C3k2 module in the backbone network was replaced with a lightweight C3k2_S (C3k2_Star) module.
In this module, the Bottleneck layers were substituted with StarBlock, which adopts a star-shaped topology with multi-branch feature interaction, achieving initial lightweighting while maintaining feature extraction capability. Second, a neck network named DWE_BiFPN (Depthwise Separable Convolution and Efficient Channel Attention-Bidirectional Feature Pyramid Network) was designed, integrating BiFPN for bidirectional multi-scale feature fusion. Standard convolutions were replaced with depthwise separable convolutions (DWConv) to reduce computational overhead, and Efficient Channel Attention (ECA) mechanisms were added at the P3, P4, and P5 feature layers to increase sensitivity to critical features. Third, the conventional detection head was replaced with a lightweight LSCD (Lightweight Shared Convolutional Detection) head, which uses grouped convolutions and parameter sharing to reduce model complexity while fitting the improved lightweight architecture. Experimental results showed that the improved model achieved clear performance gains. On the self-built dataset, precision, recall, and mean average precision (mAP@0.5) reached 89.3%, 91.6%, and 88.1%, respectively, increases of 1.8, 5.4, and 2.8 percentage points over the baseline YOLOv11n. In terms of resource efficiency, the parameter count was reduced to 1.6M (38.5% reduction), computational complexity to 5.1 GFLOPs (20.3% reduction), and model size to 3.4 MB (34.6% reduction), with a detection speed of 98.6 frames per second (23.53% improvement). Comparative analysis with mainstream models (Faster R-CNN, SSD, YOLOv5n, YOLOv8n, and YOLOv12n) showed that the improved model outperformed them in both accuracy and lightweight performance, particularly in small target detection and complex background suppression, as verified by GradCAM++ heatmap visualization.
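To make the saving from depthwise separable convolution concrete, the following minimal sketch compares weight counts for a standard convolution and its depthwise separable counterpart. The layer shape (128 input channels, 128 output channels, 3×3 kernel) is a hypothetical choice for illustration and is not taken from the model described here; bias terms are omitted.

```python
def conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def dwconv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise separable convolution (bias omitted):
    one k x k filter per input channel, then a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
C_IN, C_OUT, K = 128, 128, 3

standard = conv_params(C_IN, C_OUT, K)
separable = dwconv_params(C_IN, C_OUT, K)
ratio = separable / standard  # algebraically equal to 1/C_OUT + 1/K**2

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {ratio:.3f}")
```

For this assumed shape the separable layer needs roughly 12% of the weights of the standard one. The whole-model reductions reported above also depend on the other two modifications, so this per-layer ratio should not be read as the model-level saving.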
The model was further evaluated on a public wheat disease dataset, where it achieved 83.2% precision, 85.9% recall, and 79.6% mAP@0.5, outperforming the baseline YOLOv11n by 4.4, 5.6, and 4.3 percentage points, respectively, which supports its generalization to disease detection in other crops. In conclusion, the improved YOLOv11n model balances detection accuracy and lightweight design, making it suitable for real-time deployment on resource-constrained devices in field environments. The findings can provide a scientific basis for the precise identification and green control of flax diseases, pests, and weeds. Future work will expand the dataset to cover more flax biotic stresses and improve robustness under extreme field conditions, such as heavy occlusion.
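The 7:2:1 split reported for the 7 146-image dataset can be checked with a few lines of arithmetic. The rounding rule below (floor the training and validation shares, assign the remainder to the test set) is an assumption made for this sketch; the abstract does not state how the counts were rounded.

```python
TOTAL_IMAGES = 7146   # dataset size after augmentation, as reported
RATIOS = (7, 2, 1)    # training : validation : test


def split_counts(total: int, ratios: tuple) -> list:
    """Floor every share except the last, which receives the remainder.

    The rounding rule is an assumption of this sketch, not a documented
    detail of the study.
    """
    denom = sum(ratios)
    counts = [total * r // denom for r in ratios[:-1]]
    counts.append(total - sum(counts))
    return counts


n_train, n_val, n_test = split_counts(TOTAL_IMAGES, RATIOS)
print(n_train, n_val, n_test)
```

Under this assumption the counts come out as 5 002, 1 429, and 715, matching the reported subset sizes.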