Potato seedling and weed identification based on improved DeepLabv3+

祝诗平; 林曦; 冯川; 周杰; 李博鑫

doi:10.11975/j.issn.1002-6819.202410193

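Abstract: Crops and weeds intertwine in natural environments, and the wide variety of weed species makes accurate identification difficult. Taking potato seedlings and their associated weeds as the research objects, this study proposes a weed identification method based on an improved DeepLabv3+ model. First, with the DeepLabv3+ semantic segmentation model as the baseline, its backbone network was replaced with MobileNetV2 to form a lightweight DeepLabv3+ model. Then, to strengthen the nonlinear capability of the model, an activation function based on the attention mechanism (attention activation function, AAF) was proposed and incorporated into an AAF-Conv convolution, which replaces the first 3×3 Conv of the MobileNetV2 backbone in the lightweight DeepLabv3+ model, yielding the AAF-DeepLabv3+ model. The AAF-DeepLabv3+ model was used to obtain the morphological boundaries of the potato seedlings, and image processing methods were then used to identify the weed regions in the image. In comparison tests on the lightweight DeepLabv3+ model, AAF improved the mean intersection over union (mIoU) by 1.58, 1.31, and 1.99 percentage points over ReLU6, SiLU, and CeLU, respectively, and the mean pixel accuracy (mPA) by 1.47, 0.6, and 1.26 percentage points, respectively. In ablation tests and in comparisons with other common semantic segmentation models, the AAF-DeepLabv3+ model showed clear performance advantages: its mIoU and mPA were 90.82% and 95.56%, which are 1.07 and 1.15 percentage points higher than the original DeepLabv3+ model; its frame rate was 69.21 frames/s, 30.77 frames/s higher than the original model; and its model size was 22.56 MB, 185.96 MB smaller than the original model. The results show that, under the same test environment, the overall performance of the model is better than that of mainstream semantic segmentation networks such as UNet, PSPNet, HrNet, DeepLabv3, and FCN. The method not only reduces the up-front image annotation workload but also effectively addresses the identification difficulty caused by overlapping weed and crop targets and the wide variety of weed species, providing a technical reference for farmland weed identification on mobile devices and for the development of intelligent weeding devices.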
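The abstract reports results as mIoU and mPA. As a reminder of how these two metrics are commonly defined, the following is a minimal NumPy sketch that derives both from a pixel-level confusion matrix. The function names (`confusion_matrix`, `miou_mpa`) and the two-class toy labels are illustrative assumptions; the paper's own evaluation code is not given in this section.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    idx = y_true * num_classes + y_pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def miou_mpa(cm):
    """Return (mIoU, mPA) averaged over the classes present in the data."""
    cm = cm.astype(np.float64)
    tp = np.diag(cm)                  # correctly classified pixels per class
    gt = cm.sum(axis=1)               # ground-truth pixels per class
    pred = cm.sum(axis=0)             # predicted pixels per class
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = tp / (gt + pred - tp)   # intersection over union per class
        pa = tp / gt                  # pixel accuracy per class
    # Classes absent from both label and prediction give NaN and are skipped.
    return float(np.nanmean(iou)), float(np.nanmean(pa))

# Toy example (assumed labels: 0 = background, 1 = potato seedling).
y_true = [[0, 0, 1, 1], [0, 1, 1, 1]]
y_pred = [[0, 0, 1, 0], [0, 1, 1, 1]]
miou, mpa = miou_mpa(confusion_matrix(y_true, y_pred, num_classes=2))
print(f"mIoU = {miou:.3f}, mPA = {mpa:.3f}")
```

A gain stated in percentage points, such as 1.07, is a difference between two such values expressed in percent, not a relative change.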

Abstract: Crops and weeds are intertwined in the natural environment, and the wide variety of weed species makes accurate, real-time identification difficult. In this study, a weed identification method was proposed using the AAF-DeepLabv3+ model, with potato seedlings and their associated weeds taken as the research objects. Firstly, the backbone network of the DeepLabv3+ semantic segmentation model was replaced with MobileNetV2 to establish a lightweight DeepLabv3+ model. To improve the nonlinear capability of the model, an attention activation function (AAF) was then proposed using an attention mechanism. The AAF was integrated into an AAF-Conv convolution, which replaced the first 3×3 Conv of the MobileNetV2 backbone in the lightweight DeepLabv3+ model. As such, the AAF-DeepLabv3+ model was established. The AAF-DeepLabv3+ model was used to obtain the morphological boundaries of the potato seedlings. The weed areas were then identified from the images using image processing methods. The AAF activation function was compared with common activation functions on the lightweight DeepLabv3+ model. The mean intersection over union (mIoU) increased by 1.58, 1.31, and 1.99 percentage points compared with ReLU6, SiLU, and CeLU, respectively, while the mean pixel accuracy (mPA) increased by 1.47, 0.6, and 1.26 percentage points, respectively. The AAF-DeepLabv3+ model also showed significant performance advantages in the ablation tests and in the comparison with other common semantic segmentation models. The mIoU and mPA were 90.82% and 95.56%, which were 1.07 and 1.15 percentage points higher than those of the original DeepLabv3+ model, respectively. The frame rate was 69.21 frames/s, which was 30.77 frames/s higher than that of the original model, while the model size was 22.56 MB, which was 185.96 MB smaller than that of the original model.
Under the same test environment, the overall performance of the model was better than that of the mainstream semantic segmentation network models, such as UNet, PSPNet, HrNet, DeepLabv3, and FCN. The AAF-DeepLabv3+ model segmented the potato seedling images more accurately and produced finer contours along the seedling boundaries. It also accurately segmented potato seedlings that were small targets or that grew intertwined with weeds. Finally, compared with object detection and ordinary semantic segmentation, the weed identification method reduced the early-stage image annotation workload. It also effectively handled the overlap between weed and crop targets and the wide variety of weed species. These findings can provide a technical reference for farmland weed identification on mobile devices and for the development of intelligent weeding devices.
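The abstract states that only the potato seedlings are segmented by the network and that the weed regions are then found with image processing methods, without naming those methods. One plausible reading, sketched below under stated assumptions, is to detect all vegetation with a color index and subtract the crop mask. The excess green index (ExG), the 0.1 threshold, and the function names `vegetation_mask` and `weed_mask` are assumptions for illustration, not the authors' documented procedure.

```python
import numpy as np

def vegetation_mask(rgb, threshold=0.1):
    """Mark green vegetation using the excess green index ExG = 2g - r - b.

    ExG and the threshold value are assumptions; the source does not
    specify which image processing method is used.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    total = rgb.sum(axis=-1)
    total[total == 0] = 1.0                      # avoid division by zero on black pixels
    r, g, b = (rgb[..., i] / total for i in range(3))
    return (2.0 * g - r - b) > threshold

def weed_mask(rgb, potato_mask, threshold=0.1):
    """Weeds = vegetation pixels that the segmentation model did not label as potato."""
    return vegetation_mask(rgb, threshold) & ~np.asarray(potato_mask, dtype=bool)

# Toy 1x3 image: a soil pixel followed by two green pixels.
image = np.array([[[120, 100, 80], [40, 160, 40], [40, 160, 40]]], dtype=np.uint8)
# Hypothetical model output: only the middle pixel is potato seedling.
potato = np.array([[False, True, False]])
print(weed_mask(image, potato))
```

Because only the crop class needs pixel-level labels in such a scheme, weed species never have to be annotated individually, which is consistent with the reduced annotation workload the abstract describes.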
