Medium and Long-term Runoff Forecasting Based on Two-stage Decomposition and Interpretable Machine Learning

薛联青; 周天文; 刘远洪; 杨丽娟

Abstract: To improve the accuracy of monthly runoff prediction, this paper proposes a prediction model based on a two-stage decomposition strategy. First, Seasonal-Trend decomposition using Loess (STL) decomposes the original runoff series into trend, seasonal, and residual components. Variational Mode Decomposition (VMD) then further decomposes the highly random residual component to remove noise. Three machine learning models, Long Short-Term Memory (LSTM) networks, Convolutional Neural Networks (CNN), and Support Vector Regression (SVR), predict each component separately, and the monthly runoff prediction is the linear combination of the component predictions. Taking Shimen Station in the Lishui Basin as the study site, the Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Nash-Sutcliffe Efficiency coefficient (NSE), among other metrics, are used to evaluate prediction accuracy, and the SHAP (SHapley Additive exPlanations) interpretable machine learning method is used to examine how much each input feature of the optimal model contributes to the runoff prediction. The results show that LSTM and CNN are more accurate overall than SVR, but that differences in model structure change prediction accuracy less than differences in model inputs do. In the optimal model, STL-VMD-LSTM, the components obtained from the two-stage decomposition contribute more to the prediction than the other input features. Two-stage decomposition improves prediction accuracy overall, and most markedly for high-flow events.
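As an illustration of the evaluation step named in the abstract, the following minimal sketch computes MAE, MAPE, and NSE from their standard definitions and recombines component predictions. It is not the authors' code. The function names, the sample flows, and the reading of the final aggregation as an unweighted sum of component predictions are all assumptions made here for illustration.

```python
from typing import List, Sequence


def mae(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean Absolute Error: the mean of |obs - pred|."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def mape(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean Absolute Percentage Error in percent (undefined if any obs is 0)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


def nse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Nash-Sutcliffe Efficiency: 1 - SSE / sum of squared deviations of obs from its mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def aggregate(components: Sequence[Sequence[float]]) -> List[float]:
    """Recombine per-component predictions month by month.

    Assumption: the aggregation is an unweighted sum of the trend, seasonal,
    and VMD-mode predictions; the abstract only says it is linear.
    """
    return [sum(values) for values in zip(*components)]


# Hypothetical monthly flows (m^3/s), invented for illustration only.
observed = [100.0, 200.0, 400.0, 300.0]
predicted = [110.0, 180.0, 360.0, 330.0]

if __name__ == "__main__":
    print("MAE :", mae(observed, predicted))
    print("MAPE:", mape(observed, predicted))
    print("NSE :", nse(observed, predicted))
```

An NSE of 1 indicates a perfect fit, while a value of 0 means the model does no better than predicting the observed mean. Because NSE squares the errors, it is sensitive to how well high flows are predicted.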
