Abstract:
Precise and efficient weed detection and removal can improve crop growth conditions, thereby enhancing crop yields and economic benefits. To address the slow identification speed and low accuracy in distinguishing sugar beet seedlings from weeds in complex agricultural environments, this paper proposes a detection model based on an improved YOLO11n architecture: YOLO11-RFL (YOLO11 with Reparameterised Ghost-ELAN, Feature-Fused-Pyramid-Neck and Lightweight GN head). First, drawing inspiration from GhostNet, a Reparameterised Ghost-ELAN (RG-ELAN) module replaces YOLO11n's C3k2 component; it applies reparameterisation to the module's branches to strengthen feature extraction while reducing computational load. Second, we propose the Rethinking Features-Fused-Pyramid-Neck (RFPN) framework to refine the neck network, addressing feature misalignment in multi-scale object detection and improving real-time performance and efficiency. Third, GroupNorm is employed to refine the convolutional layers of the detection head, yielding the Lightweight Shared Convolutional Separator GN Detection Head (LSCSGND), which strengthens small-object localisation and classification and enables more accurate weed detection. Finally, a knowledge distillation (teacher-student) framework is introduced; through a feature-based Channel-Wise Knowledge Distillation (CWD) strategy, it jointly optimises the computational efficiency and accuracy of YOLO11-RFL. Experiments on the public LincolnBeet dataset demonstrate that YOLO11-RFL outperforms mainstream YOLO models. Its precision and recall surpass those of YOLO11n, reaching 80.7% and 74.9%, respectively, while mAP@0.5 and mAP@0.5:0.95 reach 80.7% and 56.9%. When evaluated on two further datasets, PDT and CWC, YOLO11-RFL improved mAP@0.5 by 0.7 and 0.1 percentage points over YOLO11n, respectively, validating the model's robustness across diverse environments. The model comprises only 2.03 million parameters and requires 5.4 GFLOPs of computation, reductions of 21.3% and 14.3%, respectively, compared to YOLO11n, demonstrating excellent engineering deployment feasibility. To comprehensively validate the inference efficiency and deployability of YOLO11-RFL in practical applications, field deployment and performance testing were conducted at an experimental base. In a TensorRT-based inference deployment, the model achieved a real-time detection frame rate of 107.5 frames per second in experimental fields, confirming its real-time inference performance and high computational efficiency in practical settings. Additionally, 296 real-world test samples were randomly collected from multiple representative areas within the experimental base to evaluate detection performance. The baseline YOLO11n achieved an mAP@0.5 of 80.4% on this test set, whereas YOLO11-RFL improved detection accuracy to 82.1% under identical conditions, confirming the proposed method's advantages in field environments. Furthermore, visualisation of the collected real-world images confirmed that YOLO11-RFL effectively mitigates the accuracy degradation and missed detections caused by environmental complexity. This research provides technical support for the precise identification and efficient management of weeds in sugar beet fields and holds significant implications for advancing intelligent agricultural production.
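As a rough illustration of the feature-based Channel-Wise Knowledge Distillation (CWD) mentioned in the abstract, the sketch below computes, for each channel of a teacher and a student feature map, a temperature-scaled softmax over spatial positions and a KL divergence between the two distributions. This is a minimal NumPy reconstruction of the general CWD idea; the temperature value, the mean reduction over channels, and the function names are assumptions, not the paper's exact settings.

```python
import numpy as np

def channel_softmax(x, T=4.0):
    # x: feature map of shape (C, H, W).
    # Per channel, convert spatial activations into a probability
    # distribution over the H*W positions (temperature-scaled).
    C = x.shape[0]
    flat = x.reshape(C, -1) / T
    flat = flat - flat.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(flat)
    return e / e.sum(axis=1, keepdims=True)        # shape (C, H*W)

def cwd_loss(student, teacher, T=4.0, eps=1e-12):
    # KL(teacher || student) per channel, averaged over channels
    # and scaled by T^2, as is conventional for softened targets.
    ps = channel_softmax(student, T)
    pt = channel_softmax(teacher, T)
    kl = (pt * (np.log(pt + eps) - np.log(ps + eps))).sum(axis=1)
    return (T ** 2) * kl.mean()
```

In this formulation the loss is zero when the student's channel-wise spatial distributions match the teacher's exactly, and grows as the student's attention over spatial positions diverges from the teacher's, which is what drives the student to mimic the teacher's salient feature regions.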