Crop disease image recognition based on multi-domain feature fusion and incremental feedback broad learning

苏涵; 戴曲顺; 李忠艳

doi:10.11975/j.issn.1002-6819.202504221

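Abstract: To address the insufficient adaptability to complex environments and the high cost of dynamic model updates faced by UAV-image-based crop disease recognition in precision agriculture, this study proposes a lightweight, highly generalizable framework for real-time crop disease image recognition. First, a multi-domain feature pyramid incremental feedback broad learning network (MFPIF-BLN) is constructed. A multi-domain feature pyramid module (MFPM) is designed, which uses the A-Shannon two-dimensional non-separable wavelet to extract hierarchical feature maps of multi-scale, multi-domain fused features, and combines cross-level multi-domain feature concatenation with channel-attention weighting to enhance feature discriminability and environmental robustness. Then, a broad learning network based on an incremental trigger-feedback mechanism (ITFM) is built, which updates the network parameters online through dynamic error monitoring and pseudo-inverse computation. Ablation experiments on the UAV-acquired corn abnormality dataset (ISVC_23 corn abnormality dataset, CAD) show that the MFPM, through multi-domain feature synergy, raises accuracy to 92.53%, precision to 94.37%, recall to 93.26%, and F1-score to 93.81%. Comparative experiments show that the broad learning model with the ITFM trigger mechanism outperforms the model without a trigger mechanism under different incremental settings: when enhancement nodes are added, ITFM-BLN achieves an accuracy of 93.97%, a precision of 98.16%, and an F1-score of 94.99%; when feature nodes are added, ITFM-BLN achieves an accuracy of 92.31%, a precision of 96.78%, and an F1-score of 94.96%. Finally, in a comparison of overall performance against several mainstream algorithms, MFPIF-BLN leads on every reported measure, with an accuracy of 94.24%, a precision of 94.75%, an F1-score of 95.18%, and an inference time of 0.0642 s. In addition, three image datasets of corn, Pittosporum tobira, and Rhododendron pulchrum leaves were constructed under real-world conditions, and the overall performance of MFPIF-BLN was compared with several mainstream algorithms on the disease recognition task in these scenarios; MFPIF-BLN still achieved the best performance. The proposed framework offers high recognition accuracy and strong generalization, and can provide an efficient solution for disease recognition in complex farmland.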
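The F1-score reported for the MFPM ablation can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The following is a minimal sketch of that check; the function name `f1_from_precision_recall` is illustrative and does not come from the paper, and the check assumes all three figures were computed on the same test split.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# MFPM ablation figures quoted in the abstract, in percent.
mfpm_f1 = f1_from_precision_recall(94.37, 93.26)
print(round(mfpm_f1, 2))  # prints 93.81, matching the reported F1-score
```

The same check cannot be run for the other configurations, because the abstract reports their precision and F1-score but not their recall.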

Abstract: UAV-image-based crop disease recognition in precision agriculture is limited by insufficient adaptability to complex environments and by the high cost of dynamically updating models. In this study, a lightweight, highly generalizable real-time recognition framework was proposed that integrates multi-domain feature fusion with adaptive incremental learning. The multi-domain feature pyramid incremental feedback broad learning network (MFPIF-BLN) combines a multi-domain feature pyramid module (MFPM) with an incremental trigger-feedback mechanism (ITFM). In the MFPM, a four-level Shannon non-separable 2D wavelet decomposition was employed to overcome the directional-sensitivity limitations of conventional wavelets. At each decomposition level, the low-frequency coefficients captured the global structure and color patterns of the disease regions, while the high-frequency coefficients preserved localized edge and texture details. To enhance discriminative power across scales, the MFPM fused multi-domain features: texture descriptors (rotation-invariant LBP and multi-directional GLCM), HSV color histograms with illumination-robust quantization, geometric invariants based on Hu moments, and multi-scale edge curvature statistics. Cross-level features were integrated through pyramidal aggregation, in which higher-level semantic features were iteratively upsampled and combined with lower-level details via skip connections. A channel attention gate further refined the feature representation by adaptively weighting the informative channels to suppress noise. For dynamic model adaptation, the ITFM extended the broad learning system (BLS) framework. Prediction errors and feature distributions were continuously monitored against dual criteria: batch-wise prediction error thresholds and cosine similarity analysis between new and historical features. 
Once significant distribution drift was detected, the attention mechanism selectively expanded the feature nodes through randomized mapping, and enhancement nodes were then generated via a nonlinear transformation. Unlike the conventional BLS, which adds nodes randomly, the feedback-driven strategy prioritized the node expansions that most reduced the current prediction error, thereby curbing computational redundancy. Network parameters were updated through block-wise incremental pseudo-inverse calculations, which avoided full-model retraining while preserving mathematical optimality. A comprehensive evaluation was performed on the ISVC_23 Corn Abnormality Dataset (12 354 UAV images). Ablation experiments showed that the MFPM, through the complementary synergy of multi-domain features, raised accuracy to 92.53%, precision to 94.37%, recall to 93.26%, and F1-score to 93.81%. Comparative experiments verified the superior incremental learning efficiency of the ITFM: by reducing redundancy among incrementally added nodes, the ITFM-BLN outperformed the conventional IBLS under varying incremental scenarios. When enhancement nodes were added, the ITFM-BLN achieved an accuracy of 93.97%, a precision of 98.16%, and an F1-score of 94.99%; when feature nodes were added, it achieved an accuracy of 92.31%, a precision of 96.78%, and an F1-score of 94.96%. Incremental learning was also compared with retraining. When expanding from 2 000 to 10 000 enhancement nodes, the MFPIF-BLN with incremental learning achieved higher accuracy (93.68% vs. 92.61%), precision (97.94% vs. 93.89%), and F1-score (94.91% vs. 93.84%) than the globally retrained BLS, while training 57.53 times faster. 
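The speed advantage of incremental learning over retraining comes from updating the pseudo-inverse rather than recomputing it. The sketch below illustrates the idea for a single added node column, using Greville's recursion for the exact Moore-Penrose pseudo-inverse. It is an assumption-laden illustration, not the authors' implementation: the paper describes a block-wise update, BLS implementations commonly use a ridge-regularized pseudo-inverse, and the names `matmul` and `add_node_column` are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two row-major matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_node_column(a: Matrix, a_pinv: Matrix, w: Matrix, h: List[float],
                    y: Matrix, eps: float = 1e-10) -> Tuple[Matrix, Matrix, Matrix]:
    """Append node output column h to A and update A^+ and W = A^+ Y without retraining."""
    m = len(a)                                   # number of training samples
    d = matmul(a_pinv, [[v] for v in h])         # n x 1: coefficients of h in span(A)
    ad = matmul(a, d)                            # m x 1: projection of h onto span(A)
    c = [h[i] - ad[i][0] for i in range(m)]      # residual of h outside span(A)
    c_norm2 = sum(v * v for v in c)
    if c_norm2 > eps:
        b = [v / c_norm2 for v in c]             # b^T = c^+ when h adds a new direction
    else:
        # h is already in span(A): b^T = (1 + d^T d)^-1 d^T A^+
        scale = 1.0 + sum(row[0] ** 2 for row in d)
        b = [sum(d[k][0] * a_pinv[k][j] for k in range(len(d))) / scale
             for j in range(m)]
    new_pinv = [[a_pinv[k][j] - d[k][0] * b[j] for j in range(m)]
                for k in range(len(a_pinv))]
    new_pinv.append(b)                           # [A^+ - d b^T ; b^T]
    by = matmul([b], y)[0]                       # b^T Y, one value per output
    new_w = [[w[k][j] - d[k][0] * by[j] for j in range(len(by))]
             for k in range(len(w))]
    new_w.append(by)                             # [W - d b^T Y ; b^T Y]
    new_a = [row + [h[i]] for i, row in enumerate(a)]
    return new_a, new_pinv, new_w


# Toy example: two samples, one existing node, one output.
a = [[1.0], [0.0]]
a_pinv = [[1.0, 0.0]]
y = [[2.0], [3.0]]
w = matmul(a_pinv, y)
new_a, new_pinv, new_w = add_node_column(a, a_pinv, w, [0.0, 1.0], y)
print(new_w)
```

Each update reuses the existing pseudo-inverse and touches only matrix-vector products, whereas retraining would recompute the pseudo-inverse of the full, enlarged node matrix.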
Finally, the MFPIF-BLN framework achieved the best overall performance among multiple deep and broad learning networks, with an accuracy of 94.24%, a precision of 94.75%, and an F1-score of 95.18%. In addition, three datasets of corn, Pittosporum tobira, and Rhododendron pulchrum leaf images were constructed in real-world scenarios, and the MFPIF-BLN was compared with various mainstream algorithms on the disease identification task in these scenarios. The MFPIF-BLN still achieved the best performance on all three datasets. The lightweight, adaptive incremental learning of the framework can provide a practical solution for dynamic agricultural environments. The MFPM's multi-scale, multi-domain analysis mitigates environmental interference from illumination variance and occlusions, while the ITFM's lightweight incremental updates adapt to seasonal and pathological variations without any field operations. Large-scale scalability can be further improved by optimizing the adaptive threshold, high-frequency noise resistance, and multi-domain data fusion.
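The dual trigger criteria of the ITFM (a batch-wise prediction error threshold and the cosine similarity between new and historical features) can be sketched as a simple decision function. This is only an illustration under stated assumptions: the abstract does not say whether the two criteria are combined with "or" or "and" (the sketch uses "or"), the threshold values 0.10 and 0.90 are placeholders, comparing mean feature vectors is one possible reading of the similarity analysis, and the names `should_expand`, `mean_vector`, and `cosine_similarity` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean of a batch of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def should_expand(batch_errors: Sequence[int],
                  new_features: Sequence[Sequence[float]],
                  historical_mean: Sequence[float],
                  error_threshold: float = 0.10,        # placeholder value
                  similarity_threshold: float = 0.90    # placeholder value
                  ) -> bool:
    """Return True when the batch looks drifted and node expansion should be triggered."""
    # Criterion 1: fraction of misclassified samples in the batch (1 = error, 0 = correct).
    batch_error = sum(batch_errors) / len(batch_errors)
    # Criterion 2: similarity between the batch's mean feature and the historical mean.
    similarity = cosine_similarity(mean_vector(new_features), historical_mean)
    # Assumption: either criterion alone is enough to trigger expansion.
    return batch_error > error_threshold or similarity < similarity_threshold


# A batch with no errors and unchanged features does not trigger expansion.
print(should_expand([0, 0, 0, 0], [[1.0, 0.0], [1.0, 0.0]], [1.0, 0.0]))
```

Gating expansion on such a check means that nodes are added only when the incoming data actually differ from what the network has seen, which is how the trigger avoids redundant growth.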
