Detection of crop disease leaf based on multi-modal feature alignment

周一帆; 刘东洋; 周宇平

doi:10.13733/j.jcam.issn.2095-5553.2024.07.027

Abstract

Existing crop disease leaf detection methods, which use image features to locate diseased leaf regions, are not accurate enough. To address this problem, a new crop disease leaf detection method based on multi-modal feature alignment is proposed. In the training phase, a visual encoder and a text encoder encode the images and texts in a crop leaf dataset. The diseased regions in a given image are located from the visual encoding features, and the fused visual and text encoding features are used for fine-grained classification of the disease type in each diseased region. In the inference phase, the pretrained disease region localization module locates the diseased regions in a given test image, and the extracted regions are used as input to the pretrained classification model. The similarity between the predicted text values and the original labels in the text set is then computed to quickly produce a fine-grained classification result for each diseased region. In tests on several open-source crop disease datasets, the proposed method achieves precision of 0.9574, 0.9611, 0.9580, and 0.9502 on the potato, tomato, apple, and strawberry disease leaf datasets, respectively. It shows better overall performance and good practical application value.
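The similarity-matching step in the inference phase can be illustrated with a minimal sketch. It assumes that the classifier's output for a diseased region and each label in the text set are available as fixed-length embedding vectors, and that cosine similarity is the comparison measure; the abstract does not specify either. The function names, label strings, and toy vectors below are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def classify_region(region_embedding, label_embeddings):
    """Pick the label whose text embedding is most similar to the region.

    region_embedding: vector predicted for one diseased region (assumed).
    label_embeddings: dict mapping label text to its text embedding (assumed).
    Returns the best label and the similarity score for every label.
    """
    scores = {
        label: cosine_similarity(region_embedding, embedding)
        for label, embedding in label_embeddings.items()
    }
    best_label = max(scores, key=scores.get)
    return best_label, scores


# Hypothetical toy embeddings; real encoders would produce much longer vectors.
label_embeddings = {
    "potato early blight": [0.9, 0.1, 0.0],
    "potato late blight": [0.1, 0.9, 0.1],
    "healthy potato leaf": [0.0, 0.1, 0.9],
}
region_embedding = [0.2, 0.8, 0.1]

best_label, scores = classify_region(region_embedding, label_embeddings)
print(best_label)
```

Because the label embeddings can be computed once and reused, each region needs only one pass over the label set at test time, which is one plausible reading of why the abstract describes this step as fast.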
