Crop disease image recognition based on multi-domain feature fusion and incremental feedback broad learning

Su Han (苏涵); Dai Qushun (戴曲顺); Li Zhongyan (李忠艳)

doi:10.11975/j.issn.1002-6819.202504221

Crop disease image recognition based on multi-domain feature fusion and incremental feedback broad learning

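Summary: To address the insufficient adaptability to complex environments and the high cost of dynamic model updating that UAV-image-based crop disease recognition faces in precision agriculture, this study proposes a lightweight, highly generalizable framework for real-time crop disease image recognition. First, a multi-domain feature pyramid incremental feedback broad learning network (MFPIF-BLN) is constructed: a multi-domain feature pyramid module (MFPM) is designed, which uses the A-Shannon two-dimensional non-separable wavelet to extract hierarchical feature maps of multi-scale, multi-domain fused features, and combines cross-level multi-domain feature concatenation with channel-attention weighting to strengthen feature discriminability and environmental robustness. Then, a broad learning network based on an incremental trigger-feedback mechanism (ITFM) is built, which updates the network parameters online through dynamic error monitoring and pseudo-inverse computation. Ablation experiments on the UAV-acquired corn abnormality dataset (ISVC_23 corn abnormality dataset, CAD) show that the MFPM module, through multi-domain feature synergy, raises accuracy to 92.53%, precision to 94.37%, recall to 93.26%, and F1 score to 93.81%. Comparative experiments show that the broad learning model with the ITFM trigger outperforms the model without a trigger under different incremental settings: when enhancement nodes are added, ITFM-BLN achieves 93.97% accuracy, 98.16% precision, and a 94.99% F1 score; when feature nodes are added, ITFM-BLN achieves 92.31% accuracy, 96.78% precision, and a 94.96% F1 score. Finally, in a comparison of overall performance against several mainstream algorithms, MFPIF-BLN leads across the board with 94.24% accuracy, 94.75% precision, a 95.18% F1 score, and an inference time of 0.0642 s. In addition, three image datasets of corn, Pittosporum tobira, and Rhododendron pulchrum leaves were built in real-world scenes, and MFPIF-BLN was compared with several mainstream algorithms on disease recognition in these scenes, where it still achieved the best performance. The proposed framework offers high recognition accuracy and strong generalization, and can provide an efficient solution for disease recognition in complex farmland.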
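The online update through pseudo-inverse computation can be illustrated with the standard column-append identity for the Moore-Penrose pseudo-inverse, which incremental broad learning commonly relies on. The following is a minimal NumPy sketch, not the authors' implementation: the function name `add_nodes`, the toy data, and the `tanh` enhancement mapping are assumptions made for illustration, and the shortcut used for the new block is exact only when the residual `C` has full column rank (or is zero).

```python
import numpy as np


def add_nodes(A, A_pinv, H_new, Y):
    """Append node columns H_new to A and update pinv(A) and the output weights.

    A      : (n, m) current node matrix (feature + enhancement nodes)
    A_pinv : (m, n) its pseudo-inverse, kept from the previous step
    H_new  : (n, k) outputs of the newly added nodes (hypothetical input)
    Y      : (n, c) training targets
    """
    D = A_pinv @ H_new            # part of H_new explained by existing nodes
    C = H_new - A @ D             # residual orthogonal to the columns of A
    if np.allclose(C, 0.0):
        # New nodes add no new direction: use the degenerate-case formula.
        k = D.shape[1]
        B_T = np.linalg.solve(np.eye(k) + D.T @ D, D.T @ A_pinv)
    else:
        # Exact when C has full column rank (assumed in this sketch).
        B_T = np.linalg.pinv(C)
    A_ext = np.hstack([A, H_new])
    A_ext_pinv = np.vstack([A_pinv - D @ B_T, B_T])
    W = A_ext_pinv @ Y            # least-squares output weights
    return A_ext, A_ext_pinv, W


# Toy demonstration with random data (all sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 12))                 # toy inputs
Y = rng.normal(size=(50, 2))                  # toy targets
Z = X @ rng.normal(size=(12, 8))              # feature nodes via random mapping
A, A_pinv = Z, np.linalg.pinv(Z)
H_new = np.tanh(0.1 * Z @ rng.normal(size=(8, 3)))  # three new enhancement nodes
A_ext, A_ext_pinv, W = add_nodes(A, A_pinv, H_new, Y)
print("matches full recomputation:",
      np.allclose(A_ext_pinv, np.linalg.pinv(A_ext)))
```

The update works only on the new columns and the stored pseudo-inverse, so the full node matrix is never re-inverted when nodes are added.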

Abstract: To address two challenges in UAV-image crop disease recognition for precision agriculture, namely insufficient environmental adaptability and the high computational cost of dynamic model updates, we proposed a lightweight, highly generalizable real-time recognition framework that integrated multi-domain feature fusion with adaptive incremental learning. The core innovation lay in the multi-domain feature pyramid incremental feedback broad learning network (MFPIF-BLN), which combined a multi-domain feature pyramid module (MFPM) with an incremental trigger-feedback mechanism (ITFM). MFPM employed a four-level A-Shannon non-separable 2D wavelet decomposition to overcome the directional sensitivity limitations of traditional wavelets. At each decomposition level, low-frequency coefficients captured the global structural and color patterns of disease regions, while high-frequency coefficients preserved localized edge and texture details. To enhance discriminative power, MFPM fused multi-domain features across scales: texture descriptors, HSV color histograms with illumination-robust quantization, geometric invariants based on Hu moments, and multi-scale edge curvature statistics. Cross-level feature integration was achieved through pyramidal aggregation, in which higher-level semantic features were iteratively upsampled and combined with lower-level details via skip connections. A channel attention gate further refined the feature representation by adaptively weighting informative channels while suppressing noise. For dynamic model adaptation, ITFM extended the broad learning system (BLS) framework. ITFM continuously monitored prediction errors and feature distribution shifts using two criteria: batch-wise prediction error thresholds and cosine similarity analysis between new and historical features.
When significant distribution drift was detected, the mechanism selectively expanded feature nodes through randomized mapping and generated the corresponding enhancement nodes via nonlinear transformation. Unlike conventional BLS, which added nodes randomly, ITFM employed a feedback-driven strategy that prioritized the node expansions that most reduced the current prediction error, effectively curbing computational redundancy. Network parameters were updated through block-wise incremental pseudo-inverse calculations, avoiding full-model retraining while maintaining mathematical optimality. Comprehensive evaluation on the ISVC_23 corn abnormality dataset (12,354 UAV images) demonstrated the framework's superiority. Ablation experiments validated the MFPM module: through the synergy and complementarity of multi-domain features, it raised accuracy to 92.53%, precision to 94.37%, recall to 93.26%, and F1-score to 93.81%. Comparative experiments verified the incremental learning efficiency of ITFM: by reducing redundancy among incremental nodes, ITFM-BLN outperformed the traditional IBLS under varying incremental scenarios. When enhancement nodes were added, ITFM-BLN achieved 93.97% accuracy, 98.16% precision, and a 94.99% F1-score; when feature nodes were expanded, it achieved 92.31% accuracy, 96.78% precision, and a 94.96% F1-score. The advantage of incremental learning over retraining was also verified: when expanding from 2,000 to 10,000 enhancement nodes, MFPIF-BLN with incremental learning achieved higher accuracy (93.68% vs. 92.61%), precision (97.94% vs. 93.89%), and F1-score (94.91% vs. 93.84%) than a globally retrained BLS, with training 57.53 times faster.
Finally, in a comparison of overall performance against multiple deep and broad learning networks, MFPIF-BLN achieved the best results, with 94.24% accuracy, 94.75% precision, and a 95.18% F1-score. In addition, three leaf image datasets of corn, Pittosporum tobira, and Rhododendron pulchrum were constructed in real-world scenarios. MFPIF-BLN was compared with various mainstream algorithms on disease identification in these scenarios and still achieved the best performance on all three datasets. The framework's lightweight design and adaptive incremental learning capability provide a practical solution for dynamic agricultural environments: MFPM's multi-scale, multi-domain analysis mitigates environmental interference from illumination variance and occlusion, while ITFM's lightweight incremental updates enable continuous adaptation to seasonal and pathological variation without interrupting field operations. Future work will focus on adaptive threshold optimization, high-frequency noise resistance, and multi-domain data fusion to further enhance scalability.
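The two-criterion trigger described in the abstract (a batch-wise prediction error threshold plus cosine similarity between new and historical features) can be sketched as follows. This is an illustrative sketch only: the abstract does not state the thresholds, how the two criteria are combined, or which feature statistic is compared, so the function name `should_expand`, the default thresholds, the use of batch-mean feature vectors, and the OR combination are all assumptions.

```python
import numpy as np


def cosine_similarity(u, v, eps=1e-12):
    """Cosine of the angle between two 1-D vectors (eps avoids division by zero)."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


def should_expand(y_true, y_pred, batch_feats, hist_mean,
                  err_thresh=0.10, sim_thresh=0.90):
    """Decide whether a new batch should trigger node expansion.

    y_true, y_pred : class labels and predictions for the batch
    batch_feats    : (n, d) feature vectors of the new batch
    hist_mean      : (d,) mean feature vector of previously seen data
    err_thresh     : assumed error-rate threshold (not from the source)
    sim_thresh     : assumed similarity threshold (not from the source)
    """
    # Criterion 1: batch-wise prediction error rate.
    err = float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))
    # Criterion 2: drift between the batch mean and the historical mean.
    sim = cosine_similarity(np.mean(batch_feats, axis=0), hist_mean)
    # Assumed combination: either criterion alone fires the trigger.
    trigger = (err > err_thresh) or (sim < sim_thresh)
    return trigger, err, sim


# Toy usage: a batch that resembles past data and is predicted correctly.
hist_mean = np.array([1.0, 0.0])
feats = np.array([[1.0, 0.0], [1.0, 0.1]])
trigger, err, sim = should_expand([0, 1, 1, 0], [0, 1, 1, 0], feats, hist_mean)
print(trigger, err, round(sim, 4))
```

When the trigger fires, a system of this kind would then add nodes and update the output weights incrementally; when it does not, the batch is processed without changing the network.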
