Seed pattern recognition model based on improved Soft Voting STPA ensemble deep learning
-
Graphical Abstract
-
Abstract
Precise identification of millet seed varieties remains a key challenge in agricultural science. Traditional morphological and physiological detection is limited by high inter-variety similarity and subtle phenotypic differences, and manual identification suffers from low throughput (10-20 samples/hour) and high subjective bias (error rate > 15%). Although machine learning has improved classification accuracy, existing deep learning models still struggle with complex seed patterns and minimal visual differences, such as those among the eight varieties studied here (average intra-class variance in RGB color space < 0.08). In this study, a progressively enhanced ensemble deep learning framework, STPA (Soft Voting-TS2-stack-Penalty-Adaptive sequential enhancement module), was proposed for seed pattern recognition. The architecture innovatively integrates four mainstream base classifiers: VGG16_bn, ResNet50, MobileNet_V2, and GoogLeNet. Initial experiments used a sigmoid linear-weighted soft voting mechanism, which achieved an accuracy of 95.52% on a dataset of 4,791 seed images, a 9.79% improvement over the best-performing submodel (ResNet50). Comparison of the submodels highlighted the complementary strengths of the different network architectures: MobileNet_V2 excelled at texture discrimination (99.17% recall for the Zhongza 85 variety), whereas VGG16_bn achieved only 35.83% accuracy on the same variety, and GoogLeNet effectively captured global shape features (99.58% accuracy for the Zhonggu 303 variety). Three modules were then proposed to optimize the STPA framework. First, a nonlinear TS2-stack dynamic weighting adjusted the voting weights according to submodel accuracy, increasing the weight of high-performance models (e.g., ResNet50 raised to 38.5%) and decreasing that of lower-performing models (e.g., MobileNet_V2 reduced to 10.3%). Compared with the initial sigmoid linear weighting, the TS2-stack improved accuracy by 2 percentage points (from 95.52% to 97.52%) and reduced the misclassification rate of easily confused varieties such as Jinmiao K4 from 8.33% to 2.08%. Second, a real-time adaptive penalty mechanism monitored per-variety submodel performance, imposing an initial penalty coefficient on consistently misclassified varieties and updating it adaptively from historical accuracy, which further raised the overall accuracy to 98.05%. Finally, a sequential soft enhancement module iteratively fused the current model predictions (70%) with the output of the next model (30%), reducing the prediction variance by 27.5% and reaching a final accuracy of 98.49%. A series of experiments validated the superiority of the STPA framework: the best performance of 98.49% accuracy was achieved within 80 training epochs, surpassing ConvNeXt-Tiny (released in 2023) by 2.37 percentage points and the latest state-of-the-art single model, AgriVision-Lite (released in 2024), by 1.27 percentage points at 200 epochs.
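To make the two fusion ideas above concrete, the following is a minimal NumPy sketch of accuracy-weighted soft voting and the 70/30 sequential soft enhancement. It is an illustration under assumed details only: the exact sigmoid weighting form, its scaling constant, and all function names and toy values are placeholders, not the authors' released implementation.

# Minimal sketch, assuming a sigmoid mapping from submodel accuracy to voting
# weight and a fixed 70/30 sequential fusion; names and values are illustrative.
import numpy as np

def sigmoid_weights(accuracies, scale=10.0):
    # Map per-submodel validation accuracies to normalized soft-voting weights
    # through a sigmoid, so stronger submodels get larger but bounded weights.
    a = np.asarray(accuracies, dtype=float)
    w = 1.0 / (1.0 + np.exp(-scale * (a - a.mean())))
    return w / w.sum()

def weighted_soft_vote(probas, weights):
    # probas: (n_models, n_samples, n_classes) softmax outputs of the submodels.
    # Returns the weighted average of the class-probability matrices.
    probas = np.asarray(probas, dtype=float)
    weights = np.asarray(weights, dtype=float).reshape(-1, 1, 1)
    return (probas * weights).sum(axis=0)

def sequential_soft_enhancement(probas, keep=0.7):
    # Iteratively fuse the running prediction with the next submodel's output,
    # keeping 70% of the current estimate and blending in 30% from the next model.
    fused = np.asarray(probas[0], dtype=float)
    for nxt in probas[1:]:
        fused = keep * fused + (1.0 - keep) * np.asarray(nxt, dtype=float)
    return fused

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Four submodels (e.g., VGG16_bn, ResNet50, MobileNet_V2, GoogLeNet),
    # 5 seed images, 8 millet varieties; probabilities here are toy data only.
    probas = rng.dirichlet(np.ones(8), size=(4, 5))
    weights = sigmoid_weights([0.842, 0.857, 0.811, 0.823])  # toy accuracies
    print("weights:", np.round(weights, 3))
    print("voted classes:   ", weighted_soft_vote(probas, weights).argmax(axis=1))
    print("enhanced classes:", sequential_soft_enhancement(probas).argmax(axis=1))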
A multi-stage strategy was therefore recommended: the soft voting mechanism exploits architectural diversity; the TS2-stack weighting enhances decision reliability; the penalty mechanism suppresses class-specific biases; and the sequential enhancement minimizes prediction uncertainty.
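The class-specific penalty stage can be sketched in the same spirit. The snippet below assumes a per-submodel, per-variety penalty coefficient that is refreshed from a running historical accuracy and applied multiplicatively to the softmax outputs before voting; the threshold, momentum, and update rule are illustrative assumptions rather than the paper's exact formulation.

# Hedged sketch of a class-specific adaptive penalty: submodels that repeatedly
# misclassify a given variety have their votes for that variety down-weighted,
# with the penalty refreshed from a running (historical) per-class accuracy.
# Threshold, momentum, and the update rule are assumptions for illustration.
import numpy as np

class AdaptiveClassPenalty:
    # One penalty coefficient per (submodel, variety) pair; it multiplies that
    # submodel's probability for that variety (1.0 means no penalty).
    def __init__(self, n_models, n_classes, init_penalty=0.8, momentum=0.9,
                 low_acc_threshold=0.5):
        self.penalty = np.ones((n_models, n_classes))
        self.hist_acc = np.ones((n_models, n_classes))  # running per-class accuracy
        self.init_penalty = init_penalty
        self.momentum = momentum
        self.low_acc_threshold = low_acc_threshold

    def update(self, model_idx, per_class_acc):
        # Blend the newest per-class accuracy into the history, then penalize
        # varieties whose historical accuracy stays below the assumed threshold.
        acc = np.asarray(per_class_acc, dtype=float)
        self.hist_acc[model_idx] = (self.momentum * self.hist_acc[model_idx]
                                    + (1.0 - self.momentum) * acc)
        low = self.hist_acc[model_idx] < self.low_acc_threshold
        self.penalty[model_idx, low] = self.init_penalty * self.hist_acc[model_idx][low]
        self.penalty[model_idx, ~low] = 1.0

    def apply(self, probas):
        # probas: (n_models, n_samples, n_classes); re-normalize after penalizing.
        scaled = np.asarray(probas, dtype=float) * self.penalty[:, None, :]
        return scaled / scaled.sum(axis=-1, keepdims=True)

# Example: the second submodel keeps misclassifying variety 3, so after repeated
# updates its vote for that variety is down-weighted while others stay at 1.0.
penalizer = AdaptiveClassPenalty(n_models=4, n_classes=8)
for _ in range(30):
    penalizer.update(model_idx=1,
                     per_class_acc=[0.9, 0.95, 0.9, 0.3, 0.9, 0.9, 0.9, 0.9])
print(np.round(penalizer.penalty[1], 3))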
-