Abstract:
Maize is one of the most significant cereal crops worldwide, and modern production requires accurate seed counting and purity assessment for high-quality control. However, conventional computer vision methods cannot fully meet the demands of large-scale conveyor-belt inspection, which involves spatially dense distributions of multiple seed varieties, subtle morphological differences among categories, and motion-induced blur caused by the relative movement between the camera and the seeds. In this study, an improved YOLOv10n architecture, YOLO-corn, was proposed to detect and then count maize seeds dynamically and in real time, with the framework optimized for high-performance detection on moving conveyor belts. The YOLO-corn detection model significantly enhanced the baseline YOLOv10n architecture through four structural improvements. 1) RFAConv (Receptive Field Attention Convolution) was integrated into the redesigned C2fE modules to mitigate the feature smearing that motion blur induces in standard convolutions with fixed parameters; spatial features within the receptive field were adaptively re-weighted to concentrate on discriminative micro-textures such as seed embryos and grain contours. 2) The Diverse Branch Block (DBB) was incorporated for shallow feature extraction; its multi-branch topology captured diverse scale-space information during training and was then fused into a single inference kernel via structural re-parameterization, enhancing local edge perception with little computational overhead. 3) The lightweight ADown down-sampling module, which combines parallel average pooling and strided convolution, was adopted to avoid the information loss typical of conventional pooling, preserving spatial fidelity and structural cues throughout the feature hierarchy. 4) A composite FPIoU-v2 loss function was proposed to accelerate convergence. 
It coupled the segmented linear re-weighting of Focal-IoU with the pixel-level boundary sensitivity of PIoU, recalibrating the loss toward hard-to-detect overlapping samples during training. The BoT-SORT algorithm was also implemented for the temporal tracking task: camera motion compensation filtered out mechanical vibrations, while an improved Kalman filter produced smoother trajectory estimates. A virtual line-crossing logic was further integrated to map trajectories into discrete counts, effectively neutralizing redundant counts caused by tracking-ID instability. Experimental evaluations demonstrated that the YOLO-corn framework substantially outperformed existing benchmarks. On a self-curated dynamic dataset covering five maize seed varieties, the model achieved a Precision of 89.2%, a Recall of 88.4%, and an mAP@0.5 of 94.0%, improvements of 2.1, 2.1, and 1.6 percentage points over the original YOLOv10n, respectively, while the number of parameters increased by only 0.17 M. In terms of throughput, the model reached 110.5 FPS on a high-performance workstation and 49.5 FPS on an NVIDIA Jetson Nano after TensorRT optimization, indicating its readiness for edge deployment. Ablation studies confirmed that the synergistic interaction between RFAConv and DBB was crucial to mitigating motion-induced blur. Furthermore, counting experiments revealed that average counting accuracy exceeded 89.3% at low-to-medium belt speeds (0.1-0.3 m/s) and stayed above 81.5% even at higher speeds (0.4-0.5 m/s), indicating strong operational robustness. The YOLO-corn framework thus offers a robust, high-accuracy, real-time solution for the online monitoring of moving maize seeds, providing a technical foundation for agricultural inspection systems that balance architectural complexity with detection fidelity. 
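The exact FPIoU-v2 formulation is not given in the abstract; as a minimal illustrative sketch, the pure-Python function below (the name `focal_iou_loss`, the threshold, and the weight values are all hypothetical placeholders) applies a segmented linear re-weighting to a plain IoU loss, assigning a larger weight to hard samples whose IoU falls below a threshold:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def focal_iou_loss(pred, target, threshold=0.5, w_hard=2.0, w_easy=1.0):
    """Segmented linear re-weighting of an IoU loss: predictions whose
    IoU with the target falls below `threshold` (hard, poorly aligned
    samples) receive a larger weight, steepening their contribution.
    Threshold and weights here are illustrative, not the paper's values."""
    i = iou(pred, target)
    weight = w_hard if i < threshold else w_easy
    return weight * (1.0 - i)
```

With these placeholder weights, a perfectly aligned box incurs zero loss, while a poorly aligned box is penalized twice as strongly per unit of (1 - IoU) as an easy one; the actual FPIoU-v2 loss additionally couples this re-weighting with PIoU's pixel-level boundary sensitivity.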
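The virtual line-crossing logic can also be sketched minimally. In the sketch below, the function name, the `tracks` data structure (a track ID, e.g. from BoT-SORT, mapped to its per-frame centroids), and the assumption that seeds move toward increasing y along the belt are illustrative assumptions, not the paper's implementation:

```python
def count_line_crossings(tracks, line_y):
    """Map tracked trajectories to discrete counts via a virtual line.

    `tracks` maps a track ID to its per-frame (x, y) centroids. A seed
    is counted exactly once, the first time its centroid crosses the
    horizontal line y = line_y; seeds are assumed to move toward
    increasing y. Counting crossing events rather than raw detections
    helps suppress redundant counts from unstable tracker IDs, since
    each trajectory contributes at most one count at the line.
    """
    count = 0
    for centroids in tracks.values():
        for (_, y0), (_, y1) in zip(centroids, centroids[1:]):
            if y0 < line_y <= y1:
                count += 1
                break  # at most one count per trajectory
    return count
```

For example, with the line at y = 10, a trajectory [(0, 2), (0, 8), (0, 15)] is counted once, while a trajectory that never reaches the line contributes nothing.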
In future work, physics-based deblurring pre-processing and 3D point-cloud data are expected to help resolve extreme occlusion and seed stacking in high-speed industrial environments.