A bidirectional time-series data-driven prediction model for environmental variables in glass greenhouses

贺琳; 李响; 杜继兵; 张新雨; 时国龙

doi:10.11975/j.issn.1002-6819.202507046

Predicting environmental variables in a greenhouse using a bidirectional temporal data-driven model

Abstract

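Abstract: Glass greenhouses are an important component of modern agriculture, and accurate prediction of their environmental conditions is essential for efficient environmental regulation. Conventional models handle such complex time-series data poorly: they suppress noise interference inadequately, mine temporal dependencies insufficiently, and lack a mechanism for assigning key weights. To address these problems, this study proposed a bidirectional time-series data-driven prediction model for environmental variables in glass greenhouses and carried out field experiments at the Tanjiawan Cloud Agriculture Test Base in Zhejiang Province. First, using an intelligent sensing network architecture based on Internet of Things (IoT) edge computing, a total of 40417105 records were collected from 268 sensors. Second, the correlations among environmental factors were analyzed with Spearman's rank correlation coefficient, and the factors with significant correlations were selected as input features. Next, the four intrinsic mode function (IMF) feature sequences obtained by variational mode decomposition (VMD) were concatenated and fed into a bidirectional long short-term memory (BiLSTM) model, and an attention mechanism was introduced to assign key weights to the hidden state vectors output by the BiLSTM. Finally, time-series predictions were obtained for air temperature, air humidity, CO₂ concentration, and light intensity. The experimental results showed that the proposed model achieved an average coefficient of determination of 0.976 and a mean absolute percentage error of 2.606% across the four selected environmental variables. The results can serve as a reference for greenhouse environmental management and regulation.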
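The feature-selection step can be illustrated with a minimal sketch. It assumes the |r|≥0.5 relevance threshold given in the English abstract and omits the P≤0.05 significance test, which would need a t-distribution (in practice `scipy.stats.spearmanr` returns both the coefficient and the p-value). The function names, variable names, and toy readings are hypothetical and are not taken from the study.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's coefficient: Pearson correlation of the two rank vectors.

    Raises ZeroDivisionError if either series is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def select_features(candidates, target, min_abs_r=0.5):
    """Keep candidate series whose |r| against the target meets the threshold."""
    return [name for name, series in candidates.items()
            if abs(spearman_r(series, target)) >= min_abs_r]


# Hypothetical toy readings, not data from the study.
air_temperature = [18.0, 19.5, 22.1, 25.3, 24.0, 20.2]
candidates = {
    "light_intensity": [0.0, 120.0, 480.0, 900.0, 700.0, 150.0],
    "soil_ph": [6.5, 6.4, 6.5, 6.6, 6.4, 6.6],
}
selected = select_features(candidates, air_temperature)
print(selected)
```

Because the coefficient is computed on ranks, it captures monotonic relationships that are not linear, which is the reason the abstract gives for choosing Spearman's coefficient.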

Abstract: Glass greenhouses offer a controllable growth environment for crops in modern agriculture. Accurate prediction of their internal environmental parameters is needed to achieve high crop yield and quality. However, predicting greenhouse environmental conditions remains challenging, because the data are non-stationary and sensitive to noise interference. Conventional prediction methods cannot fully meet the requirements of refined regulation in modern greenhouses. In particular, they suppress noise interference inadequately, mine time-series dependency information insufficiently, and lack an effective key-weight assignment. A higher-performance prediction method for greenhouse environmental parameters is therefore required. In this study, a bidirectional time-series data-driven prediction model was proposed for the environmental variables in glass greenhouses. Field experiments were conducted at the Tanjiawan Cloud Agriculture Test Base in Zhejiang Province, China, where the cloud greenhouse was integrated with the data platform. An environmental monitoring network was established in an 80 m×104 m glass greenhouse, and a multi-source sensor architecture was constructed using Internet of Things (IoT) edge computing. Environmental data were collected by the edge computing nodes, uploaded to the base stations via 5G networks, and finally forwarded to the big data servers for centralized storage in a MySQL database. A total of 40417105 raw observation records were obtained after preliminary data cleaning. Because the relationships among the greenhouse variables may be nonlinear, Spearman's correlation coefficient was selected to evaluate the correlations between environmental factors. The statistical significance level was set at P≤0.05.
A correlation coefficient of |r|≥0.5 was regarded as practically relevant. Significant environmental factors with high correlation coefficients were selected as input features to reduce input redundancy. Next, the input feature time series were decomposed into four intrinsic mode function (IMF) components using variational mode decomposition (VMD), which reduced the non-stationarity and noise interference of the series while retaining multi-scale feature information. The decomposed IMF feature sequences were then input into the bidirectional long short-term memory (BiLSTM) model, which modeled the time-series features of each IMF component. The forward and reverse long short-term memory (LSTM) layers captured the bidirectional dynamic dependencies of the time series. Subsequently, an attention mechanism assigned key weights to the hidden state vectors output by the BiLSTM according to the correlation between the sequence data and the current prediction, and a weighted average was computed from this weight distribution. The result was passed to a fully connected layer, which output the time-series predictions of air temperature, air humidity, CO₂ concentration, and light intensity. The test results showed that the proposed model achieved the best performance in all four environmental prediction tasks. Its indicators improved significantly over those of five control models: LSTM, BiLSTM, empirical mode decomposition-bidirectional long short-term memory (EMD-BiLSTM), complete ensemble empirical mode decomposition with adaptive noise-bidirectional long short-term memory (CEEMDAN-BiLSTM), and variational mode decomposition-bidirectional long short-term memory (VMD-BiLSTM).
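The attention step described above, which weights the BiLSTM hidden states and then takes a weighted average, can be sketched as follows. The abstract does not specify the scoring function, so this sketch assumes a simple dot product with a query vector followed by a softmax. The names `attention_pool` and `query`, and all numeric values, are hypothetical.

```python
from math import exp


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step's hidden state and return (weights, context).

    hidden_states: list of T vectors, e.g. concatenated forward and
    backward LSTM states. query: scoring vector of the same length; in a
    trained model it would be learned, here it is an assumed fixed value.
    """
    # Assumed scoring rule: dot product between hidden state and query.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query))
              for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector: weighted average of the hidden states.
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Hypothetical hidden states for three time steps, not values from the study.
hidden_states = [[0.1, 0.0], [0.4, 0.2], [0.9, 0.7]]
query = [1.0, 1.0]
weights, context = attention_pool(hidden_states, query)
print(weights, context)
```

The weights are non-negative and sum to one, so the context vector is a convex combination of the hidden states; time steps that score higher against the query contribute more to what the fully connected layer receives.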
Among the four variables, air temperature and air humidity were fitted best, with coefficients of determination of 0.986 and 0.981, respectively. The average coefficient of determination (R²) over the four environmental variables reached 0.976, an improvement of 0.067, 0.043, 0.033, 0.026, and 0.013 over the five control models, respectively. The average mean absolute percentage error (MAPE) was 2.606%, a decrease of 6.279, 5.606, 3.665, 2.493, and 1.810 percentage points relative to the control models, respectively. All indicators were better than those of the control models, indicating high prediction accuracy for the environmental factors. The model thus predicts the trends of key greenhouse environmental factors more accurately and efficiently, and the findings can support environmental regulation that improves the efficiency and quality of crop growth in agricultural production.
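The two reported indicators can be made concrete with a short sketch using the standard definitions of R² and MAPE. The abstract does not state how the authors handled special cases such as zero-valued observations, so none are handled here. The function names and sample values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape_percent(actual, predicted):
    """Mean absolute percentage error, expressed in percent.

    Undefined when any actual value is zero (for example, light
    intensity at night), so such points need separate handling.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical observations and predictions, not data from the study.
actual = [20.0, 25.0, 40.0, 50.0]
predicted = [22.0, 25.0, 38.0, 50.0]
print(r_squared(actual, predicted), mape_percent(actual, predicted))
```

Averaging each metric over the four predicted variables would give figures comparable in form to the reported averages of 0.976 and 2.606%.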
