
Accurate early-season extraction of winter wheat distribution based on SHAP feature selection and GA-RF

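  • Summary: Winter wheat is a major grain crop, and accurately extracting its planting area and spatial distribution early in the season is important for agricultural policy-making and yield forecasting. However, existing studies mostly extract winter wheat distribution from remote sensing imagery of a single growth stage or of the whole growing season, and early-season extraction has received little attention. To identify winter wheat accurately at an early stage, this study proposes a method that combines feature selection based on Shapley additive explanations (SHAP) with a random forest (RF) whose hyperparameters are optimized by a genetic algorithm (GA). Jiaozuo City was taken as the study area. Using multi-source remote sensing data from Landsat-8, Sentinel-1, and Sentinel-2 for 2021, the growing season was divided into eight temporal stages according to the typical growth stages of winter wheat, the identification accuracy of each stage was analyzed, and the earliest time at which extraction is feasible was determined. The results show that: (1) compared with the RF algorithm, the optimized algorithm clearly improved classification accuracy in several stages (seedling, tillering, overwintering, jointing, and harvest), with gains of 0.92 to 3.02 percentage points; (2) in stage 1 (seedling), the classification accuracy reached 92.41%, the relative error against the official statistical area was 16.69%, and the coefficient of determination (R²) of the linear regression between extracted and statistical area was 0.907; in stage 2 (tillering), the classification accuracy rose to 96.51%, the relative error fell to 8.81%, and R² rose to 0.955; (3) a comparison of classification accuracy and spatial distribution between the early stages (stages 1 and 2) and the best stage (stage 5, jointing) shows that the spatial framework of winter wheat can be identified preliminarily as early as late November (about 7 months before harvest), and that a fairly accurate spatial distribution and area can be obtained in late December (about 6 months before harvest); (4) independent validation accuracy in the other 17 cities of Henan Province exceeded 92% in every case, and the R² of the linear regression between extracted and statistical area was 0.973, demonstrating the high reliability and spatial generalization ability of the model. The proposed method improves the timeliness of early winter wheat identification, supports proactive decision-making and resource allocation, and provides a basis for rapid-response tasks such as yield estimation, pest and disease early warning, and disaster mitigation.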
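The GA-based hyperparameter search summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the search bounds, population size, mutation rate, and the `toy_fitness` function are all assumptions made for illustration. In a real workflow the fitness would be something like the cross-validated accuracy of a random forest trained with the candidate hyperparameters (number of trees, number of features considered per split, minimum samples per leaf).

```python
import random

# Hypothetical search bounds for the three RF hyperparameters (not from the paper).
BOUNDS = {
    "n_estimators": (50, 500),
    "max_features": (1, 20),
    "min_samples_leaf": (1, 10),
}
KEYS = list(BOUNDS)


def random_individual(rng):
    """Draw one candidate hyperparameter set uniformly within the bounds."""
    return {k: rng.randint(*BOUNDS[k]) for k in KEYS}


def crossover(a, b, rng):
    """Uniform crossover: each gene is copied from one parent at random."""
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in KEYS}


def mutate(ind, rng, rate=0.2):
    """Resample each gene within its bounds with probability `rate`."""
    return {k: (rng.randint(*BOUNDS[k]) if rng.random() < rate else v)
            for k, v in ind.items()}


def tournament(pop, scores, rng, k=3):
    """Return the fittest of k randomly chosen individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: scores[i])]


def genetic_search(fitness, pop_size=20, generations=30, seed=0):
    """Maximize `fitness` over the hyperparameter space with a simple GA."""
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    best, best_score = None, float("-inf")
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        for ind, s in zip(pop, scores):
            if s > best_score:
                best, best_score = dict(ind), s
        # Elitism: the best individual found so far survives unchanged.
        next_pop = [dict(best)]
        while len(next_pop) < pop_size:
            child = crossover(tournament(pop, scores, rng),
                              tournament(pop, scores, rng), rng)
            next_pop.append(mutate(child, rng))
        pop = next_pop
    return best, best_score


def toy_fitness(params):
    """Stand-in for cross-validated RF accuracy (assumption): peaks at (300, 8, 2)."""
    return -(((params["n_estimators"] - 300) / 450) ** 2
             + ((params["max_features"] - 8) / 19) ** 2
             + ((params["min_samples_leaf"] - 2) / 9) ** 2)


best_params, best_score = genetic_search(toy_fitness)
print(best_params, best_score)
```

Swapping `toy_fitness` for a function that trains and scores a classifier leaves the search loop unchanged. The fixed seed makes runs repeatable.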

     

    Abstract: Winter wheat is one of the most important staple food crops in China, and accurate, rapid extraction of its planting area and spatial distribution at an early stage is of great significance. Most current studies rely on remote sensing images from a specific growth stage or from the entire growth cycle, whereas early-season mapping of winter wheat remains relatively scarce. This study aims to accurately extract the spatial distribution of winter wheat at an early stage, particularly for decision-making and yield prediction. The proposed method combines feature selection based on Shapley additive explanations (SHAP) with hyperparameter optimization of a random forest (RF) model by a genetic algorithm (GA). The study area was Jiaozuo City, Henan Province, China. Spectral, texture, polarization, and vegetation index features were constructed from multi-source remote sensing data (Landsat-8, Sentinel-1, and Sentinel-2) acquired in 2021. According to the typical growth stages of winter wheat, the growing season was divided into eight temporal stages. The features of each stage were ranked and selected using SHAP values. Key RF hyperparameters (number of decision trees, number of features considered for splitting, and minimum samples per leaf node) were optimized using the GA. The stages were then compared in terms of classification accuracy, visual classification quality, and extracted area versus statistical area, and the earliest feasible time for accurate extraction was determined. Finally, the generalization of the model was validated across the whole of Henan Province. The results showed that: (1) Compared with the standard RF algorithm, the optimized algorithm significantly improved the classification accuracy in multiple stages (seedling, tillering, overwintering, jointing, and maturity), with increases ranging from 0.92 to 3.02 percentage points. The classification maps also showed finer visual detail. (2) In stage 1 (seedling period), the classification accuracy reached 92.41%, with a relative error of 16.69% against the official statistical area and an R² of 0.907. In stage 2 (tillering period), the classification accuracy improved to 96.51%, the relative error decreased to 8.81%, and the R² increased to 0.955. (3) A comparison of classification accuracy and spatial distribution between the early extractions (stages 1 and 2) and the optimal stage (stage 5) showed that the spatial distribution of winter wheat could be identified as early as late November (about 7 months before harvest), and that a more accurate spatial distribution could be obtained by late December (about 6 months before harvest). (4) The generalization of the model was evaluated in the remaining 17 cities of Henan Province, where the independent validation accuracy exceeded 92% in every city. The total extracted area in Henan Province was 55 182.0 km², against a statistical area of 56 907.4 km², with an R² of 0.973 and an RMSE of 384.49 km². These results indicate high reliability and spatial generalization for large-scale applications. High-precision extraction of the winter wheat distribution was achieved at the tillering stage. The improved timeliness of winter wheat mapping can support decision-making and resource allocation. The findings can also provide a reference for rapid-response scenarios, such as yield estimation and early warning of pests and diseases.
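The area-agreement metrics quoted in the abstract (relative error, RMSE, and R² of a linear regression between extracted and statistical area) can be sketched as follows. The definitions are the conventional ones and are assumed here, since the abstract does not state the formulas used. The only figures taken from the abstract are the two provincial totals; the function names are illustrative.

```python
import math


def relative_error(extracted, reference):
    """Absolute difference as a fraction of the reference (statistical) value."""
    return abs(extracted - reference) / reference


def rmse(extracted, reference):
    """Root-mean-square error between paired area values (e.g. one pair per city)."""
    n = len(extracted)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(extracted, reference)) / n)


def linear_r2(x, y):
    """R^2 of the least-squares line y = a + b*x (the squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Provincial totals from the abstract, in km^2.
province_rel_err = relative_error(55182.0, 56907.4)
print(f"{province_rel_err:.2%}")  # prints 3.03%
```

Under this definition, the provincial totals differ by about 3%. `rmse` and `linear_r2` take per-city lists of extracted and statistical areas, which the abstract does not report.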

     
