
Fine classification of crops based on attention feature fusion and dual-branch upsampling network

  • Abstract: To further improve the classification accuracy and generalization ability of existing crop classification methods, this study constructed AF-DBUNet (attention feature fusion and dual-branch upsampling network) for classifying corn and peanut in Sentinel-2 imagery. AF-DBUNet uses an attention-guided cross-fusion module (A-CFM) to optimize feature fusion between the encoder and decoder, and a dual-branch upsampling fusion module to reduce information loss, strengthening spatial feature reconstruction and model generalization. In addition, the Relief-F algorithm was used to select three key features to optimize the model input. Experimental results show that AF-DBUNet outperformed U-Net, PSPNet, and DeepLabv3+ on the corn and peanut classification task, with a mean intersection over union (mIoU) of 85.17% and an overall accuracy (OA) of 92.30%. It also performed best in the cross-county comprehensive generalization test, reaching an mIoU of 81.18% and an OA of 88.89%, and its OA remained at 80.42% in the cross-city, cross-year generalization test, demonstrating good robustness and generalization ability. The model provides an effective reference for crop classification in precision agriculture.

     

Abstract: Accurate crop classification plays a crucial role in modern precision agriculture, enabling more efficient resource management, yield estimation, and decision-making. By accurately identifying crop types, farmers and agricultural organizations can optimize irrigation, fertilization, and pest-control strategies, ultimately enhancing productivity and sustainability. To further improve the accuracy and generalization ability of crop classification, this study proposed AF-DBUNet (attention feature fusion and dual-branch upsampling network), an advanced deep semantic segmentation model that classifies corn and peanut from Sentinel-2 satellite imagery with high precision, and verified its applicability to large-scale agricultural monitoring.

The study took Pingyu County and Runan County of Zhumadian City and Tanghe County of Nanyang City, Henan Province, as experimental areas, and constructed a high-precision crop-label dataset integrating multi-temporal Sentinel-2 Level-2A images (10 m resolution) with RTK-measured ground data. To ensure consistent data quality, the images were preprocessed and resampled in SNAP 10.0, and crop-distribution labels with precise spatial positioning were generated in ArcMap. Each crop was assigned a specific color code to assist precise labeling, enabling the model to learn accurate spatial and spectral features.

To improve classification performance, feature selection was conducted with the Relief-F algorithm. Ten spectral features were first extracted from the original Sentinel-2 imagery, including the near-infrared (NIR) band and key vegetation indices such as the normalized difference vegetation index (NDVI), the ratio vegetation index (RVI), and the enhanced vegetation index (EVI). The Relief-F algorithm then ranked these features by their contribution to classification performance, and the three most informative features were selected. This process effectively reduced redundant spectral information while enhancing the model's ability to distinguish between crop types.

Additionally, to further improve the model's generalization ability and prevent overfitting, extensive data augmentation was applied to both the satellite images and the corresponding labels, including horizontal flipping, vertical flipping, diagonal mirroring, and Gaussian blur, so that the model was exposed to diverse spatial variations during training.

AF-DBUNet introduced two core components: the attention-guided cross-fusion module (A-CFM) and the dual-branch upsampling fusion module. The model adopted an encoder-decoder architecture, with the encoder based on ResNet50 from which the global average pooling layer and the fully connected layer were removed to enhance deep feature extraction.
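The abstract does not give implementation details for the Relief-F ranking step described above, so the following is only a minimal sketch of the standard multi-class Relief-F weighting scheme; the L1 distance, the number of sampled instances, and the names `X_feats` and `labels` are all assumptions, not the authors' code.

```python
import numpy as np

def relief_f(X, y, n_iter=200, seed=None):
    """Minimal Relief-F sketch: returns one weight per feature (higher = more
    informative). X: (n_samples, n_features), features scaled to [0, 1];
    y: integer class labels."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / n
    w = np.zeros(d)
    for _ in range(n_iter):
        i = rng.integers(n)
        xi, ci = X[i], y[i]
        dists = np.abs(X - xi).sum(axis=1)   # L1 distance to all samples
        dists[i] = np.inf                    # exclude the sampled instance itself
        for c, p in zip(classes, priors):
            j = np.argmin(np.where(y == c, dists, np.inf))
            diff = np.abs(X[j] - xi)
            if c == ci:                      # nearest hit: penalize differing features
                w -= diff / n_iter
            else:                            # nearest miss: reward differing features
                w += (p / (1 - priors[classes == ci][0])) * diff / n_iter
    return w

# Rank the ten candidate spectral features and keep the top three, as in the paper:
# weights = relief_f(X_feats, labels)
# top3 = np.argsort(weights)[::-1][:3]
```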
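As a concrete illustration of the augmentation pipeline named above (flips, diagonal mirroring, Gaussian blur, applied consistently to image and label), here is a sketch using the albumentations library; the probabilities are assumptions, since the abstract gives none.

```python
import albumentations as A

# Geometric transforms are applied jointly to the image and its label mask;
# the photometric Gaussian blur affects the image only.
augment = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.VerticalFlip(p=0.5),
    A.Transpose(p=0.5),       # diagonal mirroring across the main diagonal
    A.GaussianBlur(p=0.3),
])

# Usage: `image` is an HxWxC Sentinel-2 chip, `mask` the crop-label raster.
# out = augment(image=image, mask=mask)
# image_aug, mask_aug = out["image"], out["mask"]
```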
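The encoder modification stated above (ResNet50 with the global average pooling and fully connected layers removed) can be sketched with torchvision as below; the paper's further "improvements" to ResNet50 are not specified, so only the head removal is shown.

```python
import torch.nn as nn
from torchvision.models import resnet50

# Keep the convolutional stages as multi-scale features for decoder skip
# connections; the avgpool/fc classification head is simply never used.
backbone = resnet50(weights=None)
stem = nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool)
stages = [backbone.layer1, backbone.layer2, backbone.layer3, backbone.layer4]

# x = stem(images)            # 1/4 resolution
# feats = []
# for stage in stages:        # 1/4, 1/8, 1/16, 1/32 resolution feature maps
#     x = stage(x)
#     feats.append(x)
```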
The A-CFM enhanced multi-scale feature fusion through residual connections and attention mechanisms, ensuring that key crop areas were accurately classified, while the dual-branch upsampling fusion module combined bilinear interpolation with transposed convolution to optimize spatial feature reconstruction. With the improved ResNet50 as its encoder, the model was trained end to end in the PyTorch framework using a Dice loss + focal loss hybrid objective and a cosine annealing learning-rate schedule, which effectively alleviated the model bias caused by sample imbalance.

Experimental results showed that, in the model training area test, AF-DBUNet significantly outperformed PSPNet, DeepLabv3+, and U-Net on all indicators: the mean pixel accuracy (mPA) reached 92.13%, which was 5.65, 2.75, and 2.92 percentage points higher than PSPNet, U-Net, and DeepLabv3+, respectively; the mean intersection over union (mIoU) was 85.17%, which was 8.41, 3.15, and 4.03 percentage points higher than the same three models; and the overall accuracy (OA) was 92.30%, 2.42 to 4.74 percentage points higher than the other models. In terms of misclassification and omission of peanut and corn, AF-DBUNet achieved the highest user accuracy (UA) and producer accuracy (PA) in all categories, identifying the target crops more accurately.

In the cross-county independent-test-area evaluation, AF-DBUNet achieved the best comprehensive generalization performance across the four test areas, with an mIoU of 81.18%, an mPA of 89.16%, and an OA of 88.89%; the UA and PA were 87.85% and 90.50% for peanut and 87.59% and 88.07% for corn, demonstrating the relatively stable generalization ability of AF-DBUNet. In the cross-city, cross-year evaluation on 2023 Tanghe County data, the OA of AF-DBUNet remained stable at 80.42%, further verifying its generalization performance.

In summary, AF-DBUNet effectively improved the accuracy and generalization ability of crop classification through the collaborative optimization of the attention-guided feature fusion and dual-branch upsampling fusion modules. Its high accuracy (OA > 92%) and strong generalization (cross-region OA > 80%) provide a reliable tool for large-scale agricultural remote-sensing monitoring.
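The abstract states only that the A-CFM fuses encoder and decoder features with attention and residual connections; the layout below is therefore one plausible realization (SE-style channel attention over the concatenated features), not the authors' exact module.

```python
import torch
import torch.nn as nn

class ACFM(nn.Module):
    """Sketch of an attention-guided cross-fusion module (A-CFM); assumed layout."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.attn = nn.Sequential(               # SE-style channel attention
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, 1), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )

    def forward(self, enc_feat, dec_feat):
        x = self.fuse(torch.cat([enc_feat, dec_feat], dim=1))
        return x + x * self.attn(x)              # residual + attention reweighting
```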
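The dual-branch upsampling fusion module is described only as combining bilinear interpolation with transposed convolution; a minimal sketch follows, in which the fusion rule (concatenation plus a 1x1 convolution) is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualBranchUp(nn.Module):
    """Sketch of dual-branch 2x upsampling: a fixed bilinear branch and a
    learned transposed-convolution branch, merged by a 1x1 convolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.bilinear_proj = nn.Conv2d(in_ch, out_ch, kernel_size=1)
        self.deconv = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.merge = nn.Conv2d(2 * out_ch, out_ch, kernel_size=1)

    def forward(self, x):
        a = self.bilinear_proj(F.interpolate(x, scale_factor=2, mode="bilinear",
                                             align_corners=False))
        b = self.deconv(x)                           # learned upsampling branch
        return self.merge(torch.cat([a, b], dim=1))  # fuse the two reconstructions
```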
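For the training objective, a sketch of a Dice + focal hybrid loss with the cosine annealing schedule named in the abstract is given below; the term weighting, the focal gamma, and the schedule length T_max are assumptions, as the abstract reports no hyperparameter values.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DiceFocalLoss(nn.Module):
    """Sketch of the Dice + focal hybrid loss for semantic segmentation."""
    def __init__(self, n_classes, gamma=2.0, eps=1e-6):
        super().__init__()
        self.n_classes, self.gamma, self.eps = n_classes, gamma, eps

    def forward(self, logits, target):
        # Focal term: down-weights easy pixels to counter class imbalance.
        ce = F.cross_entropy(logits, target, reduction="none")
        focal = ((1 - torch.exp(-ce)) ** self.gamma * ce).mean()
        # Dice term: overlap between soft predictions and one-hot labels.
        probs = F.softmax(logits, dim=1)
        onehot = F.one_hot(target, self.n_classes).permute(0, 3, 1, 2).float()
        inter = (probs * onehot).sum(dim=(0, 2, 3))
        denom = probs.sum(dim=(0, 2, 3)) + onehot.sum(dim=(0, 2, 3))
        dice = 1 - ((2 * inter + self.eps) / (denom + self.eps)).mean()
        return dice + focal

# Cosine annealing learning-rate schedule (optimizer choice and T_max assumed):
# optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
```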
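The reported indicators (OA, mPA, mIoU, UA, PA) all follow from a per-pixel confusion matrix under their standard definitions; a small helper illustrating those definitions is sketched below.

```python
import numpy as np

def segmentation_metrics(conf):
    """Metrics from a confusion matrix `conf`
    (rows = reference classes, columns = predicted classes)."""
    tp = np.diag(conf).astype(float)
    oa = tp.sum() / conf.sum()                 # overall accuracy (OA)
    pa = tp / conf.sum(axis=1)                 # producer accuracy (per-class recall)
    ua = tp / conf.sum(axis=0)                 # user accuracy (per-class precision)
    iou = tp / (conf.sum(axis=1) + conf.sum(axis=0) - tp)
    return {"OA": oa, "mPA": pa.mean(), "mIoU": iou.mean(), "UA": ua, "PA": pa}
```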

     

