
Constructing regional groundwater level prediction models considering spatial heterogeneity

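  • Summary: Groundwater level prediction is critical to agricultural production and water resource management in arid and semi-arid regions, but accurate prediction is difficult because meteorological conditions, geological structure, and human activities act nonlinearly and vary across space. This study proposes a regional groundwater level prediction framework that combines long short-term memory (LSTM) networks with optimization algorithms. First, the explanatory variables at each observation point are screened by Pearson correlation analysis to ensure that the model inputs are informative. Next, the salp swarm algorithm (SSA) and Bayesian optimization (BO) are used to tune the LSTM hyperparameters, yielding two improved models, LSTM-SSA and LSTM-BO. Model performance is evaluated with mean absolute error (MAE), root mean square error (RMSE), and Nash–Sutcliffe efficiency (NSE). The baseline LSTM model achieved average MAE, RMSE, and NSE of 0.15, 0.24, and 0.71 across all observation points. With optimization, accuracy improved: LSTM-SSA lowered the average MAE and RMSE to 0.14 and 0.22, about 6.67% and 8.33% below the baseline, and raised the average NSE to 0.72. LSTM-BO performed best, lowering the average MAE and RMSE further to 0.13 and 0.20, about 13.33% and 16.67% below the baseline, and raising the average NSE to 0.85. In summary, the optimization strategy of LSTM-BO improved both prediction accuracy and robustness. The results indicate that LSTM models combining correlation-based screening with intelligent optimization algorithms can effectively handle the spatial heterogeneity and nonlinearity of groundwater level prediction. The study provides a scientific basis and practical reference for groundwater resource management and for the planning and design of agricultural drainage works in arid and semi-arid regions.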
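The three evaluation metrics and the reported percentage reductions can be reproduced with a few lines of NumPy. This is a minimal sketch using the standard textbook definitions; the function names (`mae`, `rmse`, `nse`, `relative_reduction`) are illustrative and are not taken from the paper.

```python
import numpy as np


def mae(obs, sim):
    """Mean absolute error between observed and simulated levels."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.mean(np.abs(obs - sim)))


def rmse(obs, sim):
    """Root mean square error between observed and simulated levels."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    return float(np.sqrt(np.mean((obs - sim) ** 2)))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs, sim = np.asarray(obs, dtype=float), np.asarray(sim, dtype=float)
    residual = np.sum((obs - sim) ** 2)
    variance = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - residual / variance)


def relative_reduction(baseline, improved):
    """Percentage by which an error metric dropped relative to the baseline."""
    return 100.0 * (baseline - improved) / baseline


# Average errors reported in the abstract (baseline LSTM vs. LSTM-BO).
print(round(relative_reduction(0.15, 0.13), 2))  # MAE reduction in percent
print(round(relative_reduction(0.24, 0.20), 2))  # RMSE reduction in percent
```

Applied to the averages quoted above, `relative_reduction` gives the same percentages as the abstract, which confirms that they are computed relative to the baseline LSTM errors.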

     

    Abstract: Groundwater table prediction plays an essential role in agricultural production, drainage engineering, and water-resources regulation in arid and semi-arid regions, where shallow groundwater dynamics respond quickly to climate forcing and anthropogenic disturbances. The study focused on twenty-three daily monitoring stations located in Linwei District, Weinan City, Shaanxi Province, a typical semi-arid region where groundwater dynamics are governed by complex non-linear interactions between meteorological forcing and human activities. A regional groundwater level prediction framework was developed and evaluated by integrating site-specific explanatory variable selection with Long Short-Term Memory (LSTM) networks and intelligent hyperparameter optimization. To address the significant spatial heterogeneity across the study area, an independent modeling strategy was adopted, wherein a customized prediction model was constructed for each individual observation well. This approach allowed the framework to implicitly capture localized hydrogeological characteristics and varying responses to external stressors without the need for explicit spatial parameterization. The methodology began with a rigorous data preprocessing stage. A standard score-based outlier detection method was applied to the daily water level series, using the 3σ criterion (values more than three standard deviations from the mean) to identify and remove anomalous data points. Missing values resulting from this process were subsequently filled using linear interpolation. To ensure the effectiveness of model inputs, a comprehensive feature screening process was conducted using Pearson correlation analysis. Eight candidate variables, including meteorological and environmental indicators such as reference crop evapotranspiration, daily maximum and minimum temperatures, precipitation, relative humidity, sunshine hours, wind speed, and soil heat flux, were evaluated.
Only those factors whose correlation coefficient with the local groundwater level exceeded 0.30 in absolute value were selected as input features. This site-adaptive input selection effectively reduced data redundancy and mitigated the risk of overfitting by excluding weakly correlated noise. To overcome the limitations of manual hyperparameter tuning, two distinct intelligent optimization strategies were implemented and compared: the Salp Swarm Algorithm (SSA) and Bayesian Optimization (BO). The SSA simulated the swarming behavior of salp chains to explore the parameter space through a leader-follower mechanism. In contrast, BO employed a Gaussian Process as a surrogate model to estimate the distribution of the objective function and utilized an Upper Confidence Bound acquisition function to balance exploration and exploitation. These optimizers were used to automatically determine the optimal configuration of the model structures. The results demonstrated that intelligent optimization significantly enhanced the predictive accuracy and robustness of the baseline LSTM models. The benchmark LSTM models achieved average values for Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Nash-Sutcliffe Efficiency (NSE) of 0.15, 0.24, and 0.71, respectively. The introduction of the SSA reduced the average MAE and RMSE to 0.14 and 0.22. The LSTM network coupled with BO (LSTM-BO) performed best, with the average MAE and RMSE further decreasing to 0.13 and 0.20. Crucially, the average NSE for the LSTM-BO models reached 0.85, and all individual stations maintained an efficiency score above 0.61. This indicated that BO provided a more stable and reliable parameter search than the SSA, particularly at complex stations where the baseline models previously failed or showed significant performance degradation.
The findings confirmed that the proposed framework, by combining site-specific feature selection with probabilistic hyperparameter optimization, offered a powerful and practical tool for groundwater resource management and agricultural drainage engineering in semi-arid environments.
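The preprocessing and feature-screening steps described in the abstract can be sketched as follows. This is an illustrative reading, not the authors' code: the function names `clean_series` and `select_features` are hypothetical, the population standard deviation is assumed for the standard score, and the example data are synthetic.

```python
import numpy as np


def clean_series(levels, z_max=3.0):
    """Drop points beyond z_max standard deviations, then fill gaps linearly."""
    x = np.asarray(levels, dtype=float).copy()
    # Standard score of every point (NaN-aware, population standard deviation).
    z = (x - np.nanmean(x)) / np.nanstd(x)
    x[np.abs(z) > z_max] = np.nan
    # Linear interpolation over the removed (and any originally missing) points.
    # np.interp holds the nearest valid value at the two ends of the series.
    idx = np.arange(x.size)
    valid = ~np.isnan(x)
    return np.interp(idx, idx[valid], x[valid])


def select_features(candidates, levels, threshold=0.30):
    """Keep the variables whose |Pearson r| with the water level exceeds threshold."""
    levels = np.asarray(levels, dtype=float)
    selected = []
    for name, values in candidates.items():
        r = np.corrcoef(np.asarray(values, dtype=float), levels)[0, 1]
        if abs(r) > threshold:
            selected.append(name)
    return selected


# Synthetic example: one spurious spike in an otherwise flat record.
raw = [10.0] * 10 + [100.0] + [10.0] * 10
cleaned = clean_series(raw)

# Synthetic example: one strongly related variable and one weakly related one.
level = np.arange(10.0)
candidates = {
    "precipitation": 2.0 * level + 1.0,     # perfectly correlated by construction
    "wind_speed": np.tile([1.0, -1.0], 5),  # alternating, weakly correlated
}
print(select_features(candidates, level))
```

Because each well is modeled independently, `select_features` would be run once per station, so different stations can end up with different input sets.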
