Abstract:
Accurate and rapid classification is often required for fine-grained images of corn seeds. In this study, the ResNet18 model was optimized to improve classification precision. Firstly, a Path Aggregation Network (PANet) was introduced to fuse the fine-grained features of the corn images. Secondly, a Reinforcement & Complementary Network (RCNet) was constructed to extract local and edge features. Finally, a Collaborative Attention Feature Fusion (CAFF) structure was introduced to adaptively weight and fuse the features extracted by RCNet, strengthening the model's attention to the overall features. Experimental results demonstrate that the improved ResNet18 achieved remarkable performance: an accuracy of 98.78%, precision of 96.62%, recall of 99.17%, and F1-score of 97.88%, which were 4.28, 4.11, 4.29, and 4.20 percentage points higher than those of the original ResNet18, respectively. Its inference speed reached 104 frames/s with a model size of 105.2 MB, indicating a favorable balance between classification precision and computational efficiency. Ablation experiments further clarified the contribution of each module. PANet enhanced bidirectional feature propagation between high-level semantic information and low-level details (such as seed surface textures and edge contours), raising accuracy by 1.11 percentage points and proving its effectiveness in multi-scale feature fusion. RCNet highlighted critical local features (e.g., unique hilum structures) through its Key Region Mask (KRM) module while capturing complementary edge information, yielding significant improvements: accuracy, precision, recall, and F1-score rose by 3.32, 3.35, 2.51, and 2.92 percentage points, respectively. CAFF adaptively weighted the features from RCNet to suppress irrelevant noise, contributing a further 0.48 percentage points of accuracy when combined with RCNet.
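The abstract does not give implementation details for CAFF, but the adaptive weighted fusion it describes over the two RCNet branches can be sketched roughly as follows. The tensor shapes, the global-average-pooling descriptor, the softmax-based weighting, and the function name `caff_fuse` are all illustrative assumptions, not the authors' code:

```python
import numpy as np

def caff_fuse(local_feat: np.ndarray, edge_feat: np.ndarray) -> np.ndarray:
    """Illustrative sketch of adaptive weighted feature fusion: derive a
    per-channel weight for each branch via global average pooling, normalize
    the two weights with a softmax, and blend the feature maps accordingly.
    Inputs are (C, H, W) feature maps; all design choices here are assumed."""
    # Global average pooling over spatial dims -> one descriptor per channel.
    w_local = local_feat.mean(axis=(1, 2))   # shape: (C,)
    w_edge = edge_feat.mean(axis=(1, 2))     # shape: (C,)

    # Softmax across the two branches so the weights sum to 1 per channel.
    logits = np.stack([w_local, w_edge])     # shape: (2, C)
    exp = np.exp(logits - logits.max(axis=0, keepdims=True))
    alpha = exp / exp.sum(axis=0, keepdims=True)

    # Adaptively weighted sum of the two branch feature maps.
    return (alpha[0][:, None, None] * local_feat
            + alpha[1][:, None, None] * edge_feat)

# Toy usage: two 8-channel 4x4 feature maps standing in for the RCNet branches.
rng = np.random.default_rng(0)
a = rng.standard_normal((8, 4, 4))
b = rng.standard_normal((8, 4, 4))
fused = caff_fuse(a, b)
print(fused.shape)  # (8, 4, 4)
```

Because the branch weights are computed from the features themselves, a branch carrying stronger activations for a given channel contributes more to the fused map, which is one plausible way irrelevant noise in the weaker branch gets suppressed.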
The synergy of these three modules explains why the integrated model outperformed any single-module variant. Comparative experiments with other models validated its superiority: ResNet34 (94.60%) and ResNet50 (94.87%) showed lower accuracy despite their deeper architectures, owing to inadequate focus on subtle feature differences. The lightweight EfficientNet-v2 achieved 95.19% accuracy but lagged in distinguishing visually similar varieties. WS-BAN (95.82%) and DBTNet (96.31%), both designed for fine-grained tasks, attained higher accuracy than the ResNet variants but suffered from slower inference (92 and 41 frames/s) and larger model sizes (179.0 and 117.5 MB). In contrast, the improved ResNet18 markedly reduced misclassification of highly similar varieties: TC705 and Y348 accounted for 31 and 40 misclassifications under the original ResNet18, which fell to 4 and 16, drops of 87.1% and 60.0%, respectively, confirming the model's ability to capture fine-grained features. Mobile deployment tests verified its practical value: the model maintained accuracy above 95% (99%, 97%, and 96%, respectively) against complex backgrounds (a palm for GS6H, green plastic for TC705, and a wooden table for Y348), with a worst-case single-image inference time of 257 ms, fully meeting real-time requirements even on mobile devices with limited hardware. This stability was attributed to robust feature extraction under varying lighting and background interference. On-site identification was realized without professional equipment, facilitating rapid, in-field seed variety verification. The improved ResNet18 effectively solved the challenge of fine-grained corn seed classification, and the findings provide a reliable technical reference for intelligent seed identification in agricultural production.
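The accuracy, precision, recall, and F1-score figures reported above follow the standard definitions in terms of true/false positives and negatives; a minimal sketch with made-up counts (not the paper's confusion matrix) is:

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int):
    """Standard per-class metrics from confusion-matrix counts; F1 is the
    harmonic mean of precision and recall. Counts are illustrative only."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for one seed variety, chosen only to exercise the math.
acc, prec, rec, f1 = classification_metrics(tp=95, fp=5, fn=2, tn=98)
print(f"acc={acc:.4f} prec={prec:.4f} rec={rec:.4f} f1={f1:.4f}")
```

Note that recall can exceed precision (as in the paper's 99.17% vs. 96.62%) when false negatives are rarer than false positives, i.e., the model misses few true instances of a variety but occasionally assigns other varieties to it.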