
Detecting maize root stubble using an improved YOLOv11n

  • Abstract: Under the "one film used for two years" practice of full-film double-ridge furrow sowing in the dryland farming areas of Northwest China, accurate detection of maize root stubble is the key technology for intelligent stubble-avoiding sowing, yet stubble recognition is hampered by interference from plastic film and by the complex backgrounds that arise because residual film and stubble look alike. To improve the detection precision and mean average precision for maize root stubble, this study proposes TriLightNet-YOLO, an improved algorithm based on the YOLOv11n model. A context-guided module, Edge Fusion Stem, reconstructs the traditional Sobel operator as 3D depthwise separable convolutions, strengthening the propagation of the fine details that small targets depend on; a DSBNCSPELAN4 feature-extraction and fusion module improves the capture of target detail and lifts overall model performance; and a newly added grouped wavelet feature interaction module, Grouped VoVGSCSP HHF Fusion, performs multi-scale feature decomposition and cross-layer fusion via the Haar wavelet transform, reinforcing target information while reducing sensitivity to noise. Experimental results show that, at an intersection-over-union threshold of 50% between predicted and ground-truth boxes, the improved model reaches a mean average precision of 92.8%, 1.7 percentage points above the YOLOv11n baseline; precision and recall improve by 1.5 and 1.0 percentage points, respectively, floating-point operations fall by 7.8%, the frame rate rises by 28 frames/s, and the parameter count grows by only 0.4 M. The algorithm provides a theoretical basis and technical support for mechanized stubble-avoiding sowing.
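The Edge Fusion Stem builds on a classic observation: the Sobel operator is itself a pair of fixed 3×3 convolution kernels, so it can be embedded directly in a depthwise separable convolution. The PyTorch sketch below illustrates only that general pattern, with fixed Sobel kernels loaded into a depthwise convolution followed by a pointwise 1×1 convolution; the paper's actual 3D reconstruction, pooling branch, and padding strategy are not reproduced, and the module name and every design choice here are illustrative assumptions.

```python
# Minimal sketch of the "Sobel as depthwise separable convolution" idea
# behind Edge Fusion Stem. The paper's exact 3D layout and pooling-branch
# padding are not specified here, so this 2D depthwise variant and all
# names (EdgeFusionStemSketch, etc.) are illustrative assumptions.
import torch
import torch.nn as nn

class EdgeFusionStemSketch(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Depthwise conv: per-channel filtering (groups=channels), two
        # gradient maps (x and y) per input channel.
        self.dw = nn.Conv2d(channels, 2 * channels, kernel_size=3,
                            padding=1, groups=channels, bias=False)
        sobel_x = torch.tensor([[-1., 0., 1.],
                                [-2., 0., 2.],
                                [-1., 0., 1.]])
        sobel_y = sobel_x.t()
        # Kernels are kept fixed so the stem behaves like an edge detector
        # rather than a freely learned convolution.
        weight = torch.stack([sobel_x, sobel_y]).repeat(channels, 1, 1)
        self.dw.weight = nn.Parameter(weight.unsqueeze(1), requires_grad=False)
        # Pointwise 1x1 conv mixes the gradient maps across channels
        # (the "separable" half of depthwise separable convolution).
        self.pw = nn.Conv2d(2 * channels, channels, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pw(self.dw(x))

# Usage: edge features for a 3-channel input batch.
stem = EdgeFusionStemSketch(channels=3)
print(stem(torch.randn(1, 3, 640, 640)).shape)  # torch.Size([1, 3, 640, 640])
```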

     

    Abstract: Under the "one film used for two years" practice of full-film double-ridge furrow sowing in the dryland farming areas of Northwest China, accurate detection of maize root stubble is a key step toward intelligent stubble-avoiding sowing, yet recognition is complicated by the similar morphology of residual plastic film and stubble and by complex field backgrounds. To improve detection precision and mean average precision, this study takes YOLOv11n as the baseline and proposes an improved model, TriLightNet-YOLO, which addresses these bottlenecks with three core modules. First, a context-guided module, Edge Fusion Stem, is designed: it reconstructs the traditional Sobel operator as 3D depthwise separable convolutions, and its pooling branch adopts a dedicated padding strategy that mitigates the loss of edge information caused by conventional pooling. The module thus retains the edge-detection behavior of the Sobel operator while strengthening the propagation of fine details of small targets and reducing computational overhead. Second, a DSBNCSPELAN4 module is constructed to replace the original C3k2 structure. It first compresses channels with 1×1 convolutions and divides the features into two branches via a Split operation: the main branch retains the original features, while the second branch enlarges the receptive field and fuses local features through a DSBNCSP block followed by 3×3 convolutions, then splits again into two sub-branches, one concatenated directly and the other passed through a further DSBNCSP block and 3×3 convolutions before concatenation. The module incorporates dilated separable convolutions to suit soil-covered scenes and applies reparameterization at inference time to balance real-time performance against texture-capturing capability. Third, a grouped wavelet feature interaction module, Grouped VoVGSCSP HHF Fusion, is introduced. Based on the Haar wavelet transform, it decomposes features into low-frequency contours and high-frequency edges, processes the high-frequency components with residual blocks, and fuses them back through the inverse transform, markedly suppressing noise such as soil cracks and residual stalks. Experiments were conducted on a dedicated maize root stubble dataset of 2,033 images, collected with a DJI Mini3 and a Huawei Mate60 in plots around Lanzhou at heights of 0.3 m and 0.9 m and at a 60° tilt angle. The dataset covers three target classes (residual film, soil mulch layer, and stubble) with 11,081 annotations; after online augmentation such as motion blur, it was divided into training, testing, and validation sets at a 7:2:1 ratio. The results show that the mAP@0.5 of the improved model reaches 92.8%, 1.7 percentage points higher than that of YOLOv11n, with precision and recall of 90.6% and 87.4%, respectively; floating-point operations are reduced to 5.9 GFLOPs, the parameter count increases by only 0.4 M, and the frame rate rises to 156 frames per second. Its overall performance surpasses mainstream lightweight models such as YOLOv5n and YOLOv8n. In a deployment test on the Orange Pi 5 MAX edge device, the model still achieves a precision of 86.4%, a recall of 83.7%, and a frame rate of 25.8 frames per second. This provides an efficient visual solution for intelligent stubble-avoiding sowing in film-mulched agriculture and supports the upgrading of agricultural mechanization in arid areas.
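The DSBNCSPELAN4 description follows a familiar CSP/ELAN recipe: compress with a 1×1 convolution, split the channels, process one branch repeatedly, and concatenate everything collected along the way. A minimal PyTorch sketch of that branching pattern is given below, with the DSBNCSP block paraphrased as a dilated depthwise separable unit; all names, channel ratios, and the omission of the inference-time reparameterization step are assumptions rather than the paper's exact configuration.

```python
# Minimal sketch of the CSP/ELAN-style branching described for DSBNCSPELAN4.
# DSBNCSPSketch stands in for the paper's DSBNCSP block; every name and
# channel ratio below is an illustrative assumption.
import torch
import torch.nn as nn

class DSBNCSPSketch(nn.Module):
    """Stand-in for DSBNCSP: dilated depthwise separable conv + BN + SiLU."""
    def __init__(self, c: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(c, c, 3, padding=2, dilation=2, groups=c, bias=False),  # dilated depthwise
            nn.Conv2d(c, c, 1, bias=False),                                   # pointwise mix
            nn.BatchNorm2d(c),
            nn.SiLU(),
        )
    def forward(self, x):
        return self.block(x)

class ELANBranchingSketch(nn.Module):
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        c = c_out // 2
        self.compress = nn.Conv2d(c_in, 2 * c, 1, bias=False)  # 1x1 channel compression
        self.stage1 = nn.Sequential(DSBNCSPSketch(c), nn.Conv2d(c, c, 3, padding=1, bias=False))
        self.stage2 = nn.Sequential(DSBNCSPSketch(c), nn.Conv2d(c, c, 3, padding=1, bias=False))
        self.fuse = nn.Conv2d(4 * c, c_out, 1, bias=False)     # fuse all collected branches

    def forward(self, x):
        main, side = self.compress(x).chunk(2, dim=1)  # Split: main branch keeps raw features
        y1 = self.stage1(side)                          # DSBNCSP + 3x3 widens receptive field
        y2 = self.stage2(y1)                            # sub-branch processed a second time
        return self.fuse(torch.cat([main, side, y1, y2], dim=1))

m = ELANBranchingSketch(64, 64)
print(m(torch.randn(1, 64, 80, 80)).shape)  # torch.Size([1, 64, 80, 80])
```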
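Similarly, the core of Grouped VoVGSCSP HHF Fusion (decompose with the Haar wavelet, refine only the high-frequency bands, invert the transform) can be sketched compactly. The single-level Haar transform below is standard; the specific residual refinement and the omission of channel grouping are simplifying assumptions.

```python
# Minimal sketch of the Haar-wavelet split/process/merge pattern described
# for Grouped VoVGSCSP HHF Fusion. The grouping scheme and the exact residual
# block are not specified in the abstract, so HaarFusionSketch and its
# refinement branch are illustrative assumptions.
import torch
import torch.nn as nn

def haar_dwt(x):
    """Single-level 2D Haar transform; returns (LL, LH, HL, HH) at half resolution."""
    a = x[..., 0::2, 0::2]; b = x[..., 0::2, 1::2]
    c = x[..., 1::2, 0::2]; d = x[..., 1::2, 1::2]
    ll = (a + b + c + d) / 2   # low-frequency contours
    lh = (a + b - c - d) / 2   # horizontal-edge detail
    hl = (a - b + c - d) / 2   # vertical-edge detail
    hh = (a - b - c + d) / 2   # diagonal detail
    return ll, lh, hl, hh

def haar_idwt(ll, lh, hl, hh):
    """Exact inverse of haar_dwt (the transform matrix is orthogonal)."""
    x = ll.new_zeros(*ll.shape[:-2], ll.shape[-2] * 2, ll.shape[-1] * 2)
    x[..., 0::2, 0::2] = (ll + lh + hl + hh) / 2
    x[..., 0::2, 1::2] = (ll + lh - hl - hh) / 2
    x[..., 1::2, 0::2] = (ll - lh + hl - hh) / 2
    x[..., 1::2, 1::2] = (ll - lh - hl + hh) / 2
    return x

class HaarFusionSketch(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Assumed residual refinement of the high-frequency bands only;
        # the low-frequency band passes through untouched, which is what
        # makes the pattern robust to high-frequency noise (cracks, stalks).
        self.refine = nn.Sequential(
            nn.Conv2d(3 * channels, 3 * channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(3 * channels),
            nn.SiLU(),
        )

    def forward(self, x):
        ll, lh, hl, hh = haar_dwt(x)
        high = torch.cat([lh, hl, hh], dim=1)
        high = high + self.refine(high)          # residual high-frequency path
        lh, hl, hh = high.chunk(3, dim=1)
        return haar_idwt(ll, lh, hl, hh)

m = HaarFusionSketch(channels=16)
print(m(torch.randn(1, 16, 64, 64)).shape)  # torch.Size([1, 16, 64, 64])
```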

     
