Method for detecting and counting wheat seedlings in complex field conditions based on improved YOLOv10n

李慎可; 朱俊科; 黄傲群; 唐志成; 杨宁; 杜常亮; 朱雨昕; 兰玉彬

doi:10.11975/j.issn.1002-6819.202511204

Detecting and counting wheat seedlings under complex field conditions using improved YOLOv10n

Abstract

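Abstract: To achieve accurate identification of wheat seedlings in complex field environments, this study proposes DSM-YOLOv10 (dualconv-sdi-msda YOLOv10), an improved algorithm based on YOLOv10n. First, a dual convolution (DualConv) structure is introduced to build a DCC2f module that replaces the C2f structure in the original backbone network, reducing redundant computation and strengthening feature fusion. Second, to address detection difficulties such as the dense distribution and mutual occlusion of wheat seedlings, a semantics and detail infusion (SDI) module replaces the Concat operation, achieving detail enhancement under semantic guidance and improving the model's ability to decouple and reuse features in overlapping seedling regions. Finally, a multi-scale dilated attention (MSDA) mechanism is introduced to strengthen multi-scale feature representation and improve the model's understanding of crossing and overlapping seedlings. Experimental results show that DSM-YOLOv10 detects wheat seedlings accurately across a variety of complex scenes: the mean average precision, precision, recall, and F1 score reached 91.4%, 85.2%, 81.7%, and 83.4%, respectively, which are 5.0, 2.6, 5.4, and 4.1 percentage points higher than those of the original YOLOv10n model; the parameter count and floating-point operations were reduced by 4.7% and 10.7%, respectively, and the inference time fell to 13.2 ms. In the seedling counting task, comparing the model's detections with field-measured data gave a coefficient of determination, root mean square error, and mean absolute error of 0.92, 5.68 plants, and 4.33 plants, respectively, outperforming YOLOv10n and other mainstream models. This study effectively improves the detection and counting of wheat seedlings in complex scenes and provides strong support for field data acquisition in agricultural practice.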
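The reported F1 score and inference speed can be cross-checked from the other figures in the abstract. The following is a minimal sketch, assuming the standard definitions (F1 as the harmonic mean of precision and recall, throughput as the reciprocal of per-image latency); the function names are illustrative and are not taken from the paper.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall (as fractions) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; works for fractions or percentages."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def frames_per_second(latency_ms):
    """Throughput implied by a per-image inference time in milliseconds."""
    return 1000.0 / latency_ms


# Precision and recall reported for DSM-YOLOv10, in percent.
print(f"F1  = {f1_score(85.2, 81.7):.1f}%")        # F1  = 83.4%
# Inference time reported for DSM-YOLOv10, in milliseconds.
print(f"FPS = {frames_per_second(13.2):.1f}")      # FPS = 75.8
```

Both values agree with the abstract: an F1 score of 83.4% and roughly 76 frames per second.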

Abstract: Wheat seedling counting is one of the most important field surveys at the emergence stage. However, challenges remain because wheat seedlings are small, densely distributed, and mutually occluded. This study aims to accurately detect and count wheat seedlings in complex field environments. An improved algorithm, named DSM-YOLOv10, was proposed based on YOLOv10n. Firstly, global annotation, rather than annotation of only local features, was adopted during data labeling, so that the overall features of wheat seedlings were learned and the feature loss caused by overlapping and occlusion was avoided. Secondly, a DCC2f module built on dual convolution (DualConv) was constructed to replace the C2f module in the backbone network. Grouped and pointwise convolutions were applied in parallel and combined with residual connections, which alleviated gradient vanishing in deep-layer training, gave a more complete transmission of cross-layer features, and reduced redundant convolution. The resulting gain in computational efficiency facilitates subsequent real-time deployment on mobile edge devices and provides more efficient features for subsequent processing. Furthermore, a semantics and detail infusion (SDI) module was introduced to explicitly inject high-level semantic information into low-level detail features using cross-layer attention and Hadamard products. Detail enhancement under semantic guidance replaced the simple feature stacking of conventional concatenation modules, which lacks deep information interaction. Consequently, the model's ability to decouple and reuse features in the overlapping regions of wheat seedlings was improved. Finally, a multi-scale dilated attention (MSDA) mechanism with multiple dilation rates was adopted. Multi-scale semantic information was aggregated effectively to integrate local details with contextual information.
Furthermore, the feature extraction of intersecting and overlapping wheat seedlings was enhanced, while the redundancy and additional computational cost of the self-attention mechanism were reduced. Experimental results demonstrated that the DSM-YOLOv10 model detected wheat seedlings precisely under multiple complex scenarios. The mean average precision (mAP), precision (P), recall (R), and F1 score were 91.4%, 85.2%, 81.7%, and 83.4%, respectively, which were 5.0, 2.6, 5.4, and 4.1 percentage points higher than those of the original YOLOv10n model. The parameter count and floating-point operations (FLOPs) were reduced by 4.7% and 10.7%, respectively. Real-time detection was also achieved, with an inference time of 13.2 ms (approximately 76 frames per second). Compared with the RetinaNet, Faster-RCNN, SSD, and YOLOv8n models, the DSM-YOLOv10 model exhibited the best detection performance: its mAP was 35.8, 24.5, 30.7, and 8.6 percentage points higher, its number of parameters was 87.4%, 96.2%, 90.0%, and 15.2% lower, and its FLOPs were 86.3%, 96.0%, 88.1%, and 7.4% lower, respectively. The DSM-YOLOv10 model also recognized small objects well, with an mAP of 86.3%, particularly in overlapping seedling scenarios. Its missed detection rate and false detection rate remained low, at 9.1% and 3.8%, respectively. In the seedling counting task, the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE) of the DSM-YOLOv10 model were 0.92, 5.68 plants, and 4.33 plants, respectively. Compared with YOLOv10n, the R² increased by 6.98%, while the RMSE and MAE decreased by 35.23% and 40.44%, respectively, indicating higher counting accuracy and robustness.
These findings can effectively improve the detection and counting of wheat seedlings in complex scenarios. The improved model also has fewer parameters and lower FLOPs than the original YOLOv10n, and can support field data acquisition in smart agriculture.
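The counting metrics quoted above (R², RMSE, and MAE) compare per-image predicted seedling counts with field-measured counts. The sketch below shows one common way to compute them. It assumes R² is defined as 1 − SS_res/SS_tot against the measured counts, which the abstract does not state explicitly. The sample counts are hypothetical and are not data from the paper.

```python
import math


def counting_metrics(measured, predicted):
    """Return (R^2, RMSE, MAE) for predicted versus measured plant counts.

    Assumes R^2 = 1 - SS_res / SS_tot, with SS_tot taken about the mean
    of the measured counts.
    """
    n = len(measured)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - m for m, p in zip(measured, predicted)]
    ss_res = sum(e * e for e in errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(ss_res / n)

    mean_measured = sum(measured) / n
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return r2, rmse, mae


# Hypothetical per-image counts (plants), for illustration only.
measured = [50, 60, 70, 80]
predicted = [48, 63, 69, 84]

r2, rmse, mae = counting_metrics(measured, predicted)
print(f"R2 = {r2:.2f}, RMSE = {rmse:.2f} plants, MAE = {mae:.2f} plants")
# R2 = 0.94, RMSE = 2.74 plants, MAE = 2.50 plants
```

RMSE and MAE are in the same unit as the counts (plants), which is why the abstract reports them as 5.68 plants and 4.33 plants. RMSE weights large miscounts more heavily than MAE does, so it is never smaller than MAE on the same data.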
