
Remote sensing extraction of abandoned farmland based on SHAP-interpretable feature selection

Extracting abandoned farmland from remote sensing images using SHAP-interpretable feature optimization

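  • Abstract (translated from the Chinese): Cultivated land is an essential component of food production, and efficient, accurate extraction of abandoned farmland is important for land management and agricultural planning. Taking Ping'an Township, Fengjie County, Chongqing as the study area, this study used domestic Gaofen (GF) satellite imagery to extract 34 features covering spectral, spectral index, texture, neighborhood, and topographic information. It proposed a feature selection method that combines the Spearman correlation coefficient with SHAP (Shapley Additive exPlanations), examined the suitability of six feature combination schemes and six classification algorithms, and finally extracted abandoned farmland by post-classification comparison. The results show that: 1) On top of the spectral and spectral index features, adding topographic, texture, and neighborhood features each effectively improved classification accuracy, with topographic features contributing the most. Overall, the selected features mainly comprised slope, the red, green, and blue bands, and the difference vegetation index. 2) All algorithms performed best under Scheme 6 (Spearman+SHAP), with a mean OA of 90.24% in 2019 and 95.83% in 2020. CatBoost achieved the highest classification accuracy, with OA and Kappa of 92.14% and 87.37% in 2019, and 97.86% and 96.59% in 2020. 3) The area of abandoned farmland in the study area from 2019 to 2020 was 3.55 km², an abandonment rate of 16.94%; the abandonment identification accuracy reached 92.71% and the area overlap rate reached 83.76%. In summary, the proposed method showed high accuracy and applicability for abandoned farmland extraction and can provide technical support and a reference for subsequent abandonment monitoring.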
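The combination of Spearman correlation filtering with an importance ranking can be sketched as follows. This is only an illustration under assumptions that the abstract does not state: the 0.9 correlation threshold, the greedy rule of keeping the more important feature of a correlated pair, the toy feature values, and the placeholder importance scores (standing in for mean absolute SHAP values) are all hypothetical, and the study's actual procedure may differ.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.
    Assumes neither input is constant (otherwise the variance is zero)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def select_features(columns, importance, rho_max=0.9):
    """Visit features from most to least important; keep one only if it is
    not strongly rank-correlated with any feature already kept.
    rho_max is an assumed threshold, not a value from the paper."""
    kept = []
    for name in sorted(columns, key=lambda f: importance[f], reverse=True):
        if all(abs(spearman(columns[name], columns[k])) < rho_max for k in kept):
            kept.append(name)
    return kept


# Toy data (made up): "dvi" and "ndvi" rise together, "slope" does not.
columns = {
    "slope": [5.0, 30.0, 12.0, 25.0, 8.0, 18.0],
    "dvi": [0.10, 0.20, 0.35, 0.40, 0.55, 0.70],
    "ndvi": [0.12, 0.25, 0.33, 0.45, 0.50, 0.80],
}
# Placeholder importance scores; in practice these might be mean |SHAP| values.
importance = {"slope": 0.42, "dvi": 0.31, "ndvi": 0.18}

selected = select_features(columns, importance)
print(selected)
```

In this toy case "ndvi" is dropped because it is perfectly rank-correlated with the more important "dvi", while "slope" and "dvi" are both kept.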

     

    Abstract: Cultivated land plays a vital role in food production and in sustaining ecological balance and food security. However, much cultivated land has been abandoned in recent years as a result of rapid urbanization and industrialization. Efficient and accurate identification of abandoned farmland is crucial for food security and efficient land use, particularly in regions with high population density and limited agricultural space. This study aims to extract abandoned farmland from remote sensing images using SHAP (Shapley Additive exPlanations)-interpretable feature optimization. The study area was Ping’an Township in Fengjie County, Chongqing, China, where sloping farmland is concentrated and susceptible to abandonment because planting conditions on steep slopes are difficult. The data source consisted of 2-meter GF satellite images of the study area acquired in 2019 and 2020. After preprocessing, 34 features were extracted, covering spectral, spectral index, texture, neighborhood, and topographic features. Spearman correlation coefficients and SHAP were combined to optimize the feature set and remove redundant features. Six feature combinations were classified and evaluated, and the best combination was selected for land cover classification. Finally, the spatial distribution of abandoned farmland was generated by post-classification comparison. The results indicated: 1) Adding topographic, texture, and neighborhood features to the spectral and spectral index features each effectively improved classification accuracy, with topographic features contributing the most. Selecting the most representative features for the classification task avoided information redundancy and overfitting, which improved the accuracy and efficiency of the model. 
Spearman correlation coefficients were used to filter out highly correlated features, while SHAP provided a precise analysis of feature importance. Combining the two removed features with weak classification power and reduced overfitting, so that only the most representative features were used, which improved computational efficiency and stability. Overall, the selected features were mainly slope, the red, green, and blue bands, and the difference vegetation index. 2) Among the six schemes, all algorithms performed best under Scheme 6 (Spearman + SHAP), with an average overall accuracy (OA) of 90.24% in 2019 and 95.83% in 2020. The choice of classification algorithm also affected accuracy, so the algorithms were compared under the optimal combination, Scheme 6. CatBoost achieved the highest accuracy, with an OA of 92.14% and a Kappa of 87.37% in 2019, and an OA of 97.86% and a Kappa of 96.59% in 2020. This superior performance was attributed to CatBoost's categorical feature encoding, ordered boosting mechanism, and symmetric tree structure, which reduce information loss, data leakage, and overfitting and adapt better to complex datasets. 3) Overlaying the optimal classification results gave an abandoned farmland area of 3.55 km², an abandonment rate of 16.94%. Verification against visual interpretation and field surveys showed a location accuracy of 92.71% for abandoned farmland and an area overlap rate of 83.76%. For smaller patches of 400 to 800 m², the overlap rate was slightly lower, at 82.68%. Abandoned farmland was thus extracted with high accuracy and a low omission rate. The method remained effective for fragmented abandoned farmland, which verified the feasibility of the feature optimization. In summary, the proposed method achieved high accuracy and applicability in extracting abandoned farmland. 
These findings can offer technical support and a reference for monitoring abandoned farmland based on land cover extraction. The approach can also inform land use decision-making in modern agriculture.
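As an illustration of the evaluation and extraction steps summarized in the abstract, the sketch below computes OA and Kappa from a confusion matrix and applies a post-classification comparison to two small land cover maps. Several details are assumptions rather than facts from the study: the class codes, the rule that "abandoned" means cropland in the earlier map turning into grass in the later map, the use of earlier-year cropland as the denominator of the abandonment rate, and all toy numbers. Only the 2 m pixel size comes from the abstract.

```python
CROPLAND, GRASS, FOREST, BUILT = 1, 2, 3, 4  # hypothetical class codes
PIXEL_AREA_M2 = 2 * 2  # 2 m imagery, so each pixel covers 4 m^2


def oa_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix
    (rows = reference classes, columns = predicted classes)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: sum over classes of (row total * column total) / N^2.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / total ** 2
    return observed, (observed - expected) / (1 - expected)


def abandoned_pixels(earlier, later, abandoned_classes=(GRASS,)):
    """Post-classification comparison: flag pixels that were cropland in
    the earlier map and fall in an assumed 'abandoned' class in the later map."""
    return [
        [a == CROPLAND and b in abandoned_classes for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(earlier, later)
    ]


# Toy 3x3 classified maps (made up for illustration).
map_2019 = [[1, 1, 3],
            [1, 1, 4],
            [2, 1, 3]]
map_2020 = [[1, 2, 3],
            [1, 1, 4],
            [2, 2, 3]]

mask = abandoned_pixels(map_2019, map_2020)
n_abandoned = sum(cell for row in mask for cell in row)
n_cropland = sum(cell == CROPLAND for row in map_2019 for cell in row)
area_m2 = n_abandoned * PIXEL_AREA_M2
rate = n_abandoned / n_cropland  # assumed definition of abandonment rate

# Toy two-class confusion matrix (made up).
oa, kappa = oa_and_kappa([[45, 5], [10, 40]])
print(area_m2, rate, oa, kappa)
```

Working on class maps rather than raw imagery means the quality of the abandonment map depends directly on the per-year classification accuracy, which is why the abstract reports OA and Kappa for each year before the extraction result.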

     

