Soil erosion rate calculation based on Stacking ensemble learning and leading factor analysis: A case study of Fengjie County in the Three Gorges Reservoir Area

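  • Summary: Calculating the soil erosion rate is a key task in soil and water conservation. To improve calculation accuracy, this study introduces the Stacking ensemble method, which can combine the strengths of different machine learning models, to obtain high-accuracy spatial distribution data of the soil erosion rate and to analyze the leading factors affecting the rate in the study area. A feature set was built from 2018 rainfall, remote sensing imagery, and other data for Fengjie County, Chongqing, in the Three Gorges Reservoir Area, with actual soil erosion rate data for the county as the benchmark. Different machine learning models were trained, and accuracy metrics and a diversity measure were used to select the best combination of base learners and meta-learner. The resulting Stacking model was used to produce a spatial distribution map of the soil erosion rate, and a marginal dependence analysis of the leading factors was then carried out against the distribution pattern of the rate. The results show that: 1) the Stacking ensemble model with the light gradient boosting machine and random forest as base learners and a linear regressor as meta-learner performed best, with a mean absolute error, root mean square error, and coefficient of determination of 252.48 t/(km²·a), 537.78 t/(km²·a), and 0.8687, respectively; 2) elevation, rainfall, vegetation cover, slope, distance to roads, and distance to water sources ranked as the six most influential factors for the soil erosion rate in Fengjie County, each accounting for more than 9% of total importance; 3) the soil erosion rate was higher in areas with an elevation of 200-520 m, total annual rainfall above 1,250 mm, NDVI of 0.24-0.27, slope of 26°-35°, distance to roads of 0-220 m, and distance to water sources of 63-387 m. In summary, the Stacking model effectively combines the strengths of different models and improves the accuracy of soil erosion rate prediction. The soil erosion rate in Fengjie County is shaped by multiple factors: overall, it is positively correlated with elevation and vegetation cover and negatively correlated with rainfall and slope, and higher erosion rates tend to occur in steep, low-elevation areas with abundant rainfall, low vegetation cover, and short distances to roads and water sources.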

     

    Abstract:
    Background Calculating and assessing soil erosion is key to soil and water conservation. To improve calculation accuracy, the Stacking ensemble method is introduced; it can combine different machine learning models to obtain high-precision spatial distribution data of the soil erosion rate. The leading factors affecting the soil erosion rate in the study area are also analyzed.
    Methods First, a feature dataset was constructed from 2018 rainfall, remote sensing images, and other data for Fengjie County, Chongqing, and the actual soil erosion rate data for the county were used as the benchmark to train different machine learning models. Then, accuracy evaluation indices and a diversity measure were used to select the optimal combination of base learners and meta-learner, construct the Stacking ensemble model, and calculate the soil erosion rate across the whole county. Finally, the marginal dependence of the leading factors was analyzed against the distribution pattern of the soil erosion rate.
    Results 1) The Stacking ensemble model with the light gradient boosting machine and random forest as base learners and linear regression as the meta-learner performed best. Its mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²) were 252.48 t/(km²·a), 537.78 t/(km²·a), and 0.8687, respectively. 2) Elevation, rainfall, vegetation cover, slope, distance to roads, and distance to water sources were the six most influential factors for the soil erosion rate in Fengjie County, each accounting for more than 9% of total importance. 3) The soil erosion rate was higher in areas with an elevation of 200-520 m, annual rainfall above 1,250 mm, a normalized difference vegetation index (NDVI) of 0.24-0.27, a slope of 26°-35°, a distance to roads of 0-220 m, and a distance to water sources of 63-387 m.
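
For readers unfamiliar with the three accuracy metrics reported in result 1), the sketch below shows how MAE, RMSE, and R² are conventionally computed. The function name `regression_metrics` and the four observed/predicted values are made up for illustration and are unrelated to the study's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R2) for two equal-length sequences of numbers."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average magnitude of the residuals.
    mae = sum(abs(r) for r in residuals) / n

    # Root mean square error: penalizes large residuals more strongly than MAE.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: 1 minus residual variance over total variance.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return mae, rmse, r2


# Illustrative values only, in the same units as the erosion rate, t/(km2*a).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
mae, rmse, r2 = regression_metrics(observed, predicted)
print(mae, rmse, r2)
```

Because RMSE squares the residuals before averaging, an RMSE well above the MAE, as in the reported 537.78 versus 252.48 t/(km²·a), indicates that a minority of locations carry comparatively large errors.
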
    Conclusions The Stacking model constructed in this study effectively integrates different models and improves the accuracy of soil erosion rate prediction. The soil erosion rate in Fengjie County is affected by many factors. In general, it was positively correlated with elevation and vegetation cover and negatively correlated with rainfall and slope. Higher soil erosion rates tended to occur in steep, low-elevation areas with abundant rainfall, low vegetation cover, and close proximity to roads and water sources.
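
The relationships between individual factors and the erosion rate come from the marginal dependence analysis. The abstract does not name the exact procedure, so the sketch below assumes it resembles a standard partial dependence computation: one feature is swept over a grid while the others keep their observed values, and the model's predictions are averaged. The function `partial_dependence`, the model `toy_model`, and its two generic features are hypothetical and carry no information about the study's findings.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction over all rows with one feature forced to each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)              # copy so the input rows stay untouched
            modified[feature_index] = value   # override only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row):
    """Hypothetical fitted model with two generic features (arbitrary coefficients)."""
    feature_a, feature_b = row
    return 2.0 * feature_a + feature_b


# Made-up samples: each row is [feature_a, feature_b].
samples = [[0.0, 1.0], [0.0, 3.0]]
grid_a = [0.0, 1.0, 2.0]

curve = partial_dependence(toy_model, samples, 0, grid_a)
print(curve)
```

In practice `predict` would wrap the trained ensemble, and the grid would span the observed range of a factor such as elevation or slope, so that value ranges with high average predicted erosion can be read off the curve.
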

     
