Abstract:
To address the pronounced scale variation of disease and pest targets, the weak feature representation of small objects, and the constraints of edge deployment in natural cotton-field scenes, a lightweight cotton disease and pest detection model, GSBS-YOLOv8, was proposed on the basis of You Only Look Once version 8 (YOLOv8). The objective was to improve detection accuracy under field conditions while substantially reducing model complexity, computational burden, and deployment cost. In the model design, GSConv and Ghost C2f modules were introduced into the backbone network and feature fusion stages to replace part of the conventional convolutional structure, thereby reducing parameter redundancy, improving feature reuse, and lowering computational overhead without weakening semantic extraction capability. A Bidirectional Feature Pyramid Network (BiFPN) was further embedded into the neck to strengthen the bidirectional propagation and weighted fusion of shallow spatial information and deep semantic information, so that fine-grained cues of small disease spots and pests at different imaging distances were retained more effectively. In addition, the Shape-IoU loss was adopted for bounding-box regression to enhance geometric matching between predicted boxes and target contours, which improved localization robustness for small, slender, and irregular targets against complex canopy backgrounds. Experimental results showed that the proposed model achieved a 1.0-percentage-point increase in mean Average Precision (mAP) over the baseline YOLOv8 model, indicating that the lightweight reconstruction not only preserved detection capability but also improved the recognition of multi-scale disease and pest targets. Meanwhile, the number of model parameters decreased from 3.01 million to 1.19 million, a reduction of 60.5%, while Floating Point Operations (FLOPs) decreased from 8.2 G to 4.7 G, a reduction of 42.7%.
The model size was compressed from 5.8 MB to 2.7 MB, a reduction of 53.4%, which markedly lowered the storage and transmission burden and improved the feasibility of deployment on resource-constrained hardware platforms. The simultaneous gains in accuracy and compression efficiency showed that the optimized architecture enhanced feature representation efficiency and alleviated the common trade-off between lightweight design and detection performance. In the inference evaluation, the model reached 101.3 frames per second on a Graphics Processing Unit (GPU), demonstrating strong real-time performance in a high-throughput computing environment. After TensorRT acceleration on the Jetson Xavier NX platform, the inference speed still reached 43.7 frames per second, confirming that the model maintained efficient execution under embedded conditions and satisfied the practical requirements of real-time field monitoring. The overall results showed that the coordinated introduction of lightweight convolutional design, ghost feature generation, bidirectional multi-scale fusion, and an improved regression loss effectively balanced accuracy and efficiency, enhanced small-object perception and localization, and avoided the substantial decline in detection capability that often accompanies model compression. Therefore, GSBS-YOLOv8 provides an effective lightweight solution for intelligent edge-side monitoring of cotton diseases and pests in natural field environments, and the study offers technical support for subsequent applications in rapid field scouting, precision prevention and control, and low-power agricultural vision systems.
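The weighted fusion performed by the BiFPN neck can be illustrated with the fast normalized fusion rule from the BiFPN literature, O = Σᵢ wᵢ·Iᵢ / (ε + Σⱼ wⱼ), with learnable weights kept non-negative. The sketch below is a minimal NumPy illustration of that rule only; the feature shapes and weight values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fast normalized fusion as used in BiFPN-style necks:
    O = sum_i(w_i * I_i) / (eps + sum_j w_j), with weights clamped
    to be non-negative (ReLU) so the normalized weights behave like
    a cheap attention over same-resolution feature maps."""
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)  # ReLU
    w = w / (eps + w.sum())                                     # normalize
    return sum(wi * f for wi, f in zip(w, features))

# Two same-resolution feature maps, e.g. a lateral input and a
# top-down upsampled input at one level of the neck (hypothetical).
p_lateral = np.ones((4, 4))
p_topdown = 3.0 * np.ones((4, 4))
fused = fast_normalized_fusion([p_lateral, p_topdown], weights=[1.0, 1.0])
```

With equal weights the fused map is (up to ε) the mean of the two inputs; during training the weights are learned per input, letting the network down-weight less informative scales.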
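As a quick consistency check, the percentage reductions quoted above follow directly from the reported raw figures; the short computation below reproduces them from the numbers given in the abstract.

```python
# Recompute the reported reductions from the raw figures in the abstract.
params_reduction = (3.01 - 1.19) / 3.01 * 100   # parameters, millions
flops_reduction  = (8.2 - 4.7) / 8.2 * 100      # FLOPs, G
size_reduction   = (5.8 - 2.7) / 5.8 * 100      # model size, MB

print(round(params_reduction, 1))  # 60.5
print(round(flops_reduction, 1))   # 42.7
print(round(size_reduction, 1))    # 53.4
```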