
Estimation method for tidal soil salinization using a three-dimensional feature space based on BO-XGBoost

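  • Summary: Soil salinization in the Yellow River Delta is relatively severe. Compared with a two-dimensional feature space, a three-dimensional feature space makes fuller use of the multi-band information in remote sensing imagery and offers stronger monitoring capability. However, few studies have used an XGBoost (extreme gradient boosting) model tuned by Bayesian optimization (BO) to select features and construct three-dimensional feature spaces for salinization monitoring in the Yellow River Delta. This study therefore extracted 35 spectral indices from Landsat 9 imagery, covering vegetation, salinity, water and other index categories. A Bayesian-optimized XGBoost model was used to assess feature importance and select indices within each category, and the selected representative indices were combined across categories to build multiple three-dimensional feature space models. Accuracy metrics were compared against field measurements to identify the optimal three-dimensional feature space inversion model for the region, and that model was used for a spatial analysis of salinization in the Yellow River Delta. The results show that: 1) The Bayesian-optimized XGBoost model was able to screen the indices. Salinity index 7 (SI7) had the highest feature importance (0.341), with R², RMSE and the ratio of performance to interquartile distance (RPIQ) reaching 0.921, 0.964 and 8.422, respectively; eight indices were finally selected for constructing the three-dimensional feature spaces. 2) The feature space monitoring model built on SI8-Albedo-WI had the highest accuracy, with R², RMSE and RPIQ of 0.922, 0.863 g/kg and 7.645, respectively. Compared with the two-dimensional feature space, the three-dimensional feature space made fuller use of the spectral information: R² and RPIQ of the optimal model rose by 0.059 and 1.191, respectively, and RMSE fell by 0.069 g/kg, giving high-accuracy prediction of soil salinization. 3) Classification of saline-alkali land based on the optimal three-dimensional model reached a Kappa coefficient of 86%; ERVI-WI-Albedo performed worst, with R², RMSE and RPIQ of 0.519, 3.464 g/kg and 1.087, respectively. 4) In the Yellow River Delta, moderately salinized land accounted for the largest share of area (29.7%) and was distributed in Lijin County and the central-western part of Kenli District, among other areas; severely salinized land accounted for the smallest share (9.8%) and was distributed mainly in the eastern part of Kenli District. The results can provide important decision-making support for the prevention and amelioration of soil salinization in the Yellow River Delta.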
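The Kappa coefficient quoted for the salinization classification can be illustrated with a short sketch. It assumes the standard Cohen's kappa definition (observed agreement corrected for chance agreement); the source does not state which variant or which class scheme the authors used, so the function name `cohens_kappa` and the class labels below are hypothetical.

```python
from collections import Counter


def cohens_kappa(reference, predicted):
    """Cohen's kappa for two equal-length sequences of class labels."""
    if len(reference) != len(predicted) or len(reference) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(reference)
    # Fraction of samples on which the two labelings agree.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    # Agreement expected by chance, from the marginal class frequencies.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical field classes versus classes mapped from a model (not real data).
field = ["non", "non", "slight", "slight", "moderate", "moderate", "severe", "severe"]
mapped = ["non", "non", "slight", "moderate", "moderate", "moderate", "severe", "severe"]
kappa = cohens_kappa(field, mapped)
```

The function returns kappa on a 0 to 1 scale for typical cases, so a value quoted as 86% would correspond to 0.86 under this definition.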

     

    Abstract: Soil salinization threatens ecological stability and sustainable agriculture under global environmental and climate change. The Yellow River Delta exhibits severe soil salinization and, compared with arid areas, pronounced ecological heterogeneity with complex surface features. Conventional two-dimensional feature spaces are limited to two environmental variables, which is restrictive in such complex environments. Furthermore, studies that rely on only a few specific indices do not systematically construct and evaluate an index pool to identify the optimal parameters for soil salinization, a step that is often required to extract salinization information fully. In contrast, three-dimensional feature spaces can make effective use of multi-band remote sensing data and multiple environmental variables, which supports high-accuracy monitoring. In this study, three-dimensional feature spaces were constructed from spectral indices selected with Bayesian-optimized XGBoost models, in order to estimate tidal soil salinization in the Yellow River Delta. Landsat 9 satellite imagery was used to build a spectral index pool. A total of 35 spectral indices were extracted, including vegetation, salinity and water indices. An XGBoost model with Bayesian optimization was used to evaluate the features in terms of modelling efficiency and to screen them according to the built-in Gain metric. The two most important indices were retained from each category. Multiple three-dimensional feature space models were then constructed by combining these representative indices. The three coordinate axes of each space represent different index types, so any point (x, y, z) in the feature space corresponds to the values of the three indices for a specific pixel in the remote sensing image.
Simultaneously, multiple two-dimensional feature spaces were built using the single most important index from each category. The accuracy metrics of the models were compared against field-measured data, and the optimal three- and two-dimensional feature space models were determined for soil salinization inversion in the Yellow River Delta. A regional spatial analysis of salinization was then conducted with the optimal model. The results show that: 1) The XGBoost model with Bayesian optimization effectively screened the most relevant indices. The salinity indices achieved the highest modelling accuracy (R² = 0.921, RMSE = 0.964 g/kg, and RPIQ = 8.422), with salinity index 7 (SI7) having the highest feature importance (0.341). Ultimately, the eight most informative indices were selected. 2) Three-dimensional feature spaces exploited the spectral information more fully than two-dimensional feature spaces. Compared with the two-dimensional feature space, the optimal three-dimensional model improved R² by 0.059 and RPIQ by 1.191 and reduced RMSE by 0.069 g/kg, indicating that the three-dimensional approach predicts soil salinization with higher precision. 3) Among the three-dimensional feature space models, SI8-Albedo-WI achieved the highest accuracy (R² = 0.922, RMSE = 0.863 g/kg, RPIQ = 7.645, and Kappa coefficient = 86%), whereas the ERVI-WI-Albedo model performed worst (R² = 0.519, RMSE = 3.464 g/kg, and RPIQ = 1.087). 4) Moderately salinized areas accounted for the largest proportion (29.7%) of the Yellow River Delta and were distributed primarily in the central-western part of Kenli District and in Lijin County; severely salinized areas constituted the smallest proportion (9.8%) and were located mainly in the eastern part of Kenli District. The findings can provide reference and decision-making support for preventing and remediating soil salinization in the Yellow River Delta.
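The three accuracy metrics used to rank the feature space models can be written out as a minimal sketch. It assumes R² is the coefficient of determination (1 minus residual over total sum of squares) and RPIQ is the interquartile range of the observed values divided by RMSE, with quartiles taken by linear interpolation; the abstract does not give the exact formulas, and the function name `regression_metrics` and the sample values are hypothetical.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return (R^2, RMSE, RPIQ) for paired observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0 or ss_res == 0:
        raise ValueError("metrics are undefined for constant observations or a perfect fit")
    r2 = 1.0 - ss_res / ss_tot           # assumed: coefficient of determination
    rmse = math.sqrt(ss_res / n)         # same unit as the input (e.g. g/kg)
    # Quartiles of the observed values, linear interpolation between data points.
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    rpiq = (q3 - q1) / rmse              # assumed: IQR of observations over RMSE
    return r2, rmse, rpiq


# Hypothetical measured and predicted soil salt contents in g/kg (not real data).
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
estimated = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, rmse, rpiq = regression_metrics(measured, estimated)
```

Because RPIQ scales the error by the spread of the observations, a model with the same RMSE scores higher on a data set with a wider range of salt contents, which is why it is reported alongside RMSE rather than instead of it.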

     
