Abstract:
Accurate and rapid classification is often required for fine-grained images of corn seeds. In this study, the ResNet18 model was optimized to improve classification precision. Firstly, a Path Aggregation Network (PANet) was introduced to fuse the fine-grained features of the corn images. Secondly, a Reinforcement & Complementary Network (RCNet) was constructed to extract local and edge features. Finally, a Collaborative Attention Feature Fusion (CAFF) structure was introduced to adaptively weight and fuse the features extracted by RCNet, strengthening the model's attention to the overall features. Experimental results demonstrate that the improved ResNet18 achieved remarkable performance: an accuracy of 98.78%, precision of 96.62%, recall of 99.17%, and F1-score of 97.88%, which were 4.28, 4.11, 4.29, and 4.20 percentage points higher than those of the original ResNet18, respectively. Its inference speed reached 104 frames/s with a model size of 105.2 MB, indicating a favorable balance between classification precision and computational efficiency. Ablation experiments further clarified the contribution of each module. PANet enhanced bidirectional feature propagation between high-level semantic information and low-level details (such as seed surface textures and edge contours), raising accuracy by 1.11 percentage points and proving its effectiveness in multi-scale feature fusion. RCNet highlighted critical local features (e.g., unique hilum structures) through its Key Region Mask (KRM) module while capturing complementary edge information, yielding significant improvements: accuracy, precision, recall, and F1-score rose by 3.32, 3.35, 2.51, and 2.92 percentage points, respectively. CAFF adaptively weighted the features from RCNet to suppress irrelevant noise, contributing a further 0.48 percentage points of accuracy when combined with RCNet.
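The abstract does not give implementation details for CAFF, but the adaptive weighted fusion it describes over the two RCNet branches can be sketched roughly as follows. The tensor shapes, the global-average-pooling descriptor, the softmax-based weighting, and the function name `caff_fuse` are all illustrative assumptions, not the authors' code:

```python
import numpy as np

def caff_fuse(local_feat: np.ndarray, edge_feat: np.ndarray) -> np.ndarray:
    """Illustrative sketch of adaptive weighted feature fusion: derive a
    per-channel weight for each branch via global average pooling, normalize
    the two weights with a softmax, and blend the feature maps accordingly.
    Inputs are (C, H, W) feature maps; all design choices here are assumed."""
    # Global average pooling over spatial dims -> one descriptor per channel.
    w_local = local_feat.mean(axis=(1, 2))   # shape: (C,)
    w_edge = edge_feat.mean(axis=(1, 2))     # shape: (C,)

    # Softmax across the two branches so the weights sum to 1 per channel.
    logits = np.stack([w_local, w_edge])     # shape: (2, C)
    exp = np.exp(logits - logits.max(axis=0, keepdims=True))
    alpha = exp / exp.sum(axis=0, keepdims=True)

    # Adaptively weighted sum of the two branch feature maps.
    return (alpha[0][:, None, None] * local_feat
            + alpha[1][:, None, None] * edge_feat)

# Toy usage: two 8-channel 4x4 feature maps standing in for the RCNet branches.
rng = np.random.default_rng(0)
a = rng.standard_normal((8, 4, 4))
b = rng.standard_normal((8, 4, 4))
fused = caff_fuse(a, b)
print(fused.shape)  # (8, 4, 4)
```

Because the branch weights are computed from the features themselves, a branch carrying stronger activations for a given channel contributes more to the fused map, which is one plausible way irrelevant noise in the weaker branch gets suppressed.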
The synergy of these three modules explains why the integrated model outperformed any single-module variant. Comparative experiments with other models validated its superiority: ResNet34 (94.60%) and ResNet50 (94.87%) showed lower accuracy despite their deeper architectures, owing to inadequate focus on subtle feature differences. The lightweight EfficientNet-v2 achieved 95.19% accuracy but lagged in distinguishing visually similar varieties. WS-BAN (95.82%) and DBTNet (96.31%), both designed for fine-grained tasks, attained higher accuracy than the ResNet variants but suffered from slower inference (92 and 41 frames/s) and larger model sizes (179.0 and 117.5 MB). In contrast, the improved ResNet18 markedly reduced misclassification of highly similar varieties: TC705 and Y348 accounted for 31 and 40 misclassifications under the original ResNet18, which fell to 4 and 16, drops of 87.1% and 60.0%, respectively, confirming the model's ability to capture fine-grained features. Mobile deployment tests verified its practical value: the model maintained accuracy above 95% (99%, 97%, and 96%, respectively) against complex backgrounds (a palm for GS6H, green plastic for TC705, and a wooden table for Y348), with a worst-case single-image inference time of 257 ms, fully meeting real-time requirements even on mobile devices with limited hardware. This stability was attributed to robust feature extraction under varying lighting and background interference. On-site identification was realized without professional equipment, facilitating rapid, in-field seed variety verification. The improved ResNet18 effectively solved the challenge of fine-grained corn seed classification, and the findings provide a reliable technical reference for intelligent seed identification in agricultural production.
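The accuracy, precision, recall, and F1-score figures reported above follow the standard definitions in terms of true/false positives and negatives; a minimal sketch with made-up counts (not the paper's confusion matrix) is:

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int):
    """Standard per-class metrics from confusion-matrix counts; F1 is the
    harmonic mean of precision and recall. Counts are illustrative only."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for one seed variety, chosen only to exercise the math.
acc, prec, rec, f1 = classification_metrics(tp=95, fp=5, fn=2, tn=98)
print(f"acc={acc:.4f} prec={prec:.4f} rec={rec:.4f} f1={f1:.4f}")
```

Note that recall can exceed precision (as in the paper's 99.17% vs. 96.62%) when false negatives are rarer than false positives, i.e., the model misses few true instances of a variety but occasionally assigns other varieties to it.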