Seed pattern recognition model based on improved Soft Voting STAA ensemble deep learning

李鸿强; 张栋; 张超; 张诗欣; 李民赞

doi:10.11975/j.issn.1002-6819.202411258

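Abstract: To address the low recognition accuracy caused by subtle phenotypic differences among millet seeds, this study built a Soft Voting ensemble learning model on four base models: VGG16_bn, Resnet50, MobileNet_V2, and GoogleNet. The ensemble reached an average accuracy of 95.52% across 8 millet varieties, 12.76 percentage points higher than the best submodel, Resnet50. To further strengthen the ensemble, the voting mechanism was optimized and the STAA enhancement framework was proposed. First, the TS2-stack dynamic weight allocation algorithm applies a nonlinear mapping of submodel accuracy, $\tan\left(\frac{\pi x_i^{2}}{x}\right)$, to amplify the contribution of high-performing models; this raised recognition accuracy by 2.00 percentage points over the initial deep learning ensemble. Second, an adaptive penalty mechanism dynamically suppresses the weight of submodels that perform poorly on specific varieties; combined with TS2-stack, it raised accuracy by a further 0.50 percentage points over TS2-stack alone. Finally, a sequential soft enhancement selection module and an adaptive parameter update module improve decision smoothness through cooperative optimization of multi-model predictions (70%/30% iterative weights), adding another 0.47 percentage points over the combination of the adaptive penalty mechanism and TS2-stack. The final model reached a recognition accuracy of 98.49%. The experiments show that the STAA framework, through dynamic weight allocation, noise suppression, and cooperative optimization, markedly improves the discrimination of seeds with complex phenotypes. This work provides a reference for seed image recognition.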
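
As a rough illustration of the soft voting step, the sketch below fuses per-model class probabilities with a weighted average and picks the class with the highest fused probability. The function names `soft_vote` and `predict`, the example probabilities, and the weights are all hypothetical; the weights here are arbitrary positive numbers, not the paper's TS2-stack mapping or its penalty coefficients.

```python
def soft_vote(prob_vectors, weights):
    """Weighted average of per-model class-probability vectors.

    prob_vectors: one probability list per submodel (same length each).
    weights: one non-negative weight per submodel (normalized internally).
    """
    total = sum(weights)
    n_classes = len(prob_vectors[0])
    fused = [0.0] * n_classes
    for probs, w in zip(prob_vectors, weights):
        for k, p in enumerate(probs):
            fused[k] += (w / total) * p
    return fused


def predict(prob_vectors, weights):
    """Index of the class with the highest fused probability."""
    fused = soft_vote(prob_vectors, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical outputs of four submodels for one image, three classes.
probs = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.5, 0.2],
    [0.6, 0.3, 0.1],
]

equal_choice = predict(probs, [1, 1, 1, 1])     # class 0 wins (0.45 vs 0.40)
weighted_choice = predict(probs, [1, 3, 3, 1])  # class 1 wins (0.475 vs 0.35)
```

Raising the weights of the second and third submodels flips the decision, which is the effect any accuracy-based weighting scheme relies on.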

Abstract: Precise identification of millet seeds remains a key challenge in agricultural science. Traditional morphological and physiological detection is limited by the high similarity between varieties and their subtle phenotypic differences. Manual identification also suffers from low throughput (10-20 samples/hour) and high subjective bias (error rate > 15%). Although machine learning has improved classification accuracy, existing deep learning models still struggle with complex seed patterns and minimal visual differences, as in the eight varieties studied here (average intra-class variance in RGB color space < 0.08). In this study, a progressively enhanced ensemble deep learning framework, STPA (Soft Voting-TS2-stack-Penalty-Adaptive sequential enhancement module), was proposed for seed pattern recognition. The architecture integrates four mainstream base classifiers: VGG16_bn, ResNet50, MobileNet_V2, and GoogLeNet. Initial experiments used a sigmoid linear-weighted soft voting mechanism, which achieved an accuracy of 95.52% on a dataset of 4,791 seed images, a 9.79% improvement over the best-performing submodel (ResNet50). The baseline ensemble benefited from the complementary strengths of the different network architectures. The MobileNet_V2 submodel excelled in texture discrimination (recall of 99.17% for the Zhongza 85 variety), whereas the VGG submodel achieved only 35.83% accuracy on the same task. The GoogLeNet submodel effectively captured global shape features (99.58% accuracy for the Zhonggu 303 variety). Three modules were then proposed to optimize the STPA framework. First, nonlinear TS2-stack dynamic weighting adjusted the voting weights according to submodel accuracy, increasing the weight of high-performance models (e.g., the ResNet50 weight rose to 38.5%) and decreasing the weight of lower-performing models (e.g., the MobileNet_V2 weight fell to 10.3%). Compared with the initial sigmoid linear weighting, TS2-stack improved accuracy by 2.00 percentage points (from 95.52% to 97.52%) and reduced the misclassification rate of easily confused varieties, such as Jinmiao K4, from 8.33% to 2.08%. Second, a real-time adaptive penalty mechanism monitored each submodel's performance on each variety. An initial penalty coefficient was imposed for consistently misclassified varieties and then adaptively updated using historical accuracy, raising the overall accuracy to 98.05%. Finally, a sequential soft enhancement module iteratively fused the current model prediction (70%) with the output of the next model (30%). Together with the adaptive update module, this reduced the prediction variance by 27.5% and gave a final accuracy of 98.49%. A series of experiments validated the STPA framework. It reached 98.49% accuracy within 80 training epochs, surpassing ConvNeXt-Tiny (released in 2023) by 2.37 percentage points and the latest state-of-the-art single model, AgriVision-Lite (released in 2024), by 1.27 percentage points at 200 epochs. A multi-stage strategy is recommended: the soft voting mechanism provides architectural diversity; TS2-stack weighting enhances decision reliability; the penalty mechanism suppresses class-specific biases; and sequential enhancement minimizes prediction uncertainty.
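
The 70%/30% sequential fusion described in the abstract can be sketched as a running blend over the submodel outputs. This is a minimal illustration under stated assumptions: the function name `sequential_fuse`, the `keep` parameter, and the order in which models are visited are hypothetical, and the adaptive parameter update is not modeled.

```python
def sequential_fuse(prob_vectors, keep=0.7):
    """Blend model outputs one after another.

    At each step the running prediction keeps `keep` of its value and
    takes the remaining (1 - keep) from the next model's output.
    """
    current = list(prob_vectors[0])
    for nxt in prob_vectors[1:]:
        current = [keep * c + (1.0 - keep) * n for c, n in zip(current, nxt)]
    return current


# Hypothetical two-class outputs from three submodels, visited in order.
fused = sequential_fuse([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
# fused is approximately [0.49, 0.51]
```

With four submodels and a fixed 0.7/0.3 split, the effective weights on the models work out to 0.343, 0.147, 0.21, and 0.3 in visiting order, so the ordering matters and later models are not simply discounted.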
