Abstract:
Glass greenhouses offer a controllable growth environment for crops in modern agriculture, and accurate prediction of their internal environmental parameters is needed to achieve high crop yield and quality. However, predicting greenhouse environmental conditions remains challenging, because the data are non-stationary and sensitive to noise interference during crop production. Conventional prediction methods cannot fully meet the requirements of refined regulation in modern greenhouses. In particular, they fall short in suppressing noise interference, mining time-series dependency information, and assigning effective weights to key information. A high-performance prediction method for greenhouse environmental parameters is therefore needed. In this study, a bidirectional time-series data-driven prediction model was proposed for the environmental variables in glass greenhouses. Field experiments were conducted at the Tanjiawan Cloud Agriculture Test Base in Zhejiang Province, China, where the cloud greenhouse was integrated with the data platform. An environmental monitoring network was established in an 80 m × 104 m glass greenhouse, and a multi-source sensor architecture was constructed using Internet of Things (IoT) edge computing. Environmental data were collected by the edge computing nodes, uploaded to the base stations via 5G networks, and finally forwarded to the big data servers for centralized storage in a MySQL database. A total of 40,417,105 raw observation records were obtained after preliminary data cleaning. Because the relationships among the greenhouse variables may be nonlinear, Spearman's correlation coefficient was selected to evaluate the correlations between environmental factors. A statistical significance level of P ≤ 0.05 was set, and a correlation coefficient of |r| ≥ 0.5 was regarded as practically relevant. Significant environmental factors with high correlation coefficients were selected as input features to reduce the redundancy of the model input. The input feature time series were then decomposed into four intrinsic mode function (IMF) components using variational mode decomposition (VMD), which reduced the non-stationarity and noise interference of the series while retaining multi-scale feature information. The decomposed IMF sequences were input into a bidirectional long short-term memory (BiLSTM) model, which extracted the time-series features of each IMF component. Its forward and backward long short-term memory (LSTM) layers captured the bidirectional dynamic dependencies of the time series. Subsequently, an attention mechanism was introduced to assign weights to the hidden state vectors output by the BiLSTM according to the correlation between the sequence data and the current prediction, and a weighted average was computed according to this weight distribution. The result was passed to a fully connected layer, which output the predicted time series of air temperature, air humidity, CO₂ concentration, and light intensity. The test results showed that the proposed model achieved the best performance in all four environmental prediction tasks, with every indicator improved compared with the five control models: LSTM, BiLSTM, empirical mode decomposition-BiLSTM (EMD-BiLSTM), complete ensemble empirical mode decomposition with adaptive noise-BiLSTM (CEEMDAN-BiLSTM), and VMD-BiLSTM. The best fits were obtained for air temperature and air humidity, with coefficients of determination of 0.986 and 0.981, respectively. The average coefficient of determination (R²) over the four environmental variables reached 0.976, which was 0.067, 0.043, 0.033, 0.026, and 0.013 higher than that of the respective control models. The average mean absolute percentage error (MAPE) was 2.606%, which was 6.279, 5.606, 3.665, 2.493, and 1.810 percentage points lower than that of the respective control models. All indicators were better than those of the control models, indicating high accuracy in predicting the environmental factors. The model thus predicts the trends of the key environmental factors in the greenhouse more accurately and efficiently, which can help maintain the best growing conditions and enhance the efficiency and quality of crop growth in agricultural production.
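
The feature screening described in the abstract can be illustrated with a minimal sketch. It computes Spearman's rank correlation as the Pearson correlation of average ranks and keeps the candidates whose |r| reaches 0.5. The series names (`sensor_a`, `sensor_b`, `sensor_c`) and their values are hypothetical, and the significance test at P ≤ 0.05 is not computed here; in practice a statistics library such as `scipy.stats.spearmanr` would likely supply both the coefficient and the p-value.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman's coefficient: Pearson correlation of the rank vectors.

    Assumes neither series is constant (otherwise the variance is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def select_features(candidates, target, threshold=0.5):
    """Keep candidate names whose |r| against the target meets the threshold."""
    return [name for name, series in candidates.items()
            if abs(spearman_r(series, target)) >= threshold]


# Hypothetical toy data, not measurements from the study.
target = [1, 2, 3, 4, 5]
candidates = {
    "sensor_a": [1, 4, 9, 16, 25],  # monotone but nonlinear
    "sensor_b": [10, 8, 6, 4, 2],   # monotone decreasing
    "sensor_c": [3, 1, 4, 1, 5],    # weakly related, contains ties
}
print(select_features(candidates, target))
```

Because the coefficient is rank-based, the nonlinear but monotone `sensor_a` series is treated the same as a linear one, which is the reason a rank correlation suits variables with suspected nonlinear relationships.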
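
The attention step, in which the BiLSTM hidden states are weighted and averaged, can be sketched as follows. The abstract does not state how the weights are scored, so this example assumes a simple dot product between each hidden state and a query vector, followed by a softmax. The function names, the `query` vector, and the toy hidden states are all hypothetical, and the sketch should not be read as the study's actual layer.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step's hidden state and return the weighted average.

    hidden_states: list of T vectors (one per time step), each of length D.
    query: vector of length D used to score relevance (an assumption here).
    """
    # Assumed scoring function: dot product between hidden state and query.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)  # non-negative, sums to 1
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Hypothetical hidden states for three time steps with two features each.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(states, query=[1.0, 1.0])
print(weights, context)
```

Since the weights are a softmax output, the context vector is a convex combination of the hidden states, which matches the "weighted average" wording in the abstract. In a full model this vector would feed the fully connected output layer.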
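
The two evaluation indicators reported above follow their standard definitions: R² = 1 − SS_res / SS_tot, and MAPE is the mean of |y − ŷ| / |y| expressed in percent. A minimal sketch is given below. The function names and the toy values are illustrative only and are not data from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (requires non-zero targets)."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


# Illustrative values only.
observed = [20.0, 25.0, 40.0, 50.0]
predicted = [22.0, 25.0, 38.0, 50.0]
print(r2_score(observed, predicted), mape(observed, predicted))
```

Note that the MAPE reductions in the abstract are given in percentage points, so they are differences between two MAPE values rather than relative changes.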