Solar greenhouse environment prediction model based on SSA-LSTM-Attention
-
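Summary:
An accurate greenhouse environment prediction model supports precise regulation of the greenhouse environment and thereby promotes crop growth and development. Because the greenhouse microclimate is time-dependent, nonlinear, and strongly coupled, this study proposes a solar greenhouse environment prediction model based on SSA-LSTM-Attention (sparrow search algorithm-long short-term memory-attention mechanism). First, environmental data inside and outside the greenhouse were acquired with a greenhouse Internet of Things (IoT) data acquisition system. Second, Pearson correlation analysis was used to select strongly correlated factors. Finally, a time-series matrix of environmental features was constructed and fed to the model to predict the greenhouse environment. For four environmental factors of the solar greenhouse (indoor temperature, indoor humidity, light intensity, and soil moisture), the SSA-LSTM-Attention model reached an average fitting index of 97.9%, which was 8.1, 4.1, 3.5, and 3.0 percentage points higher than that of the back propagation neural network (BP), long short-term memory (LSTM), gated recurrent unit (GRU), and LSTM-Attention models, respectively. The mean absolute percentage error averaged 2.6%, which was 6.5, 3.2, 2.8, and 2.5 percentage points lower than that of the four control models, respectively. The results show that automatically optimizing the hyperparameters of the LSTM-Attention model with SSA improves prediction accuracy and provides effective data support for proactive regulation of the solar greenhouse environment.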
Abstract: An accurate greenhouse environment prediction model is essential for precisely regulating the greenhouse environment and promoting crop growth and development. Greenhouse microclimates are time-dependent, nonlinear, and strongly coupled across environmental factors. To address these characteristics, this paper proposes a solar greenhouse environment prediction model based on SSA-LSTM-Attention (sparrow search algorithm-long short-term memory-attention mechanism). Environmental data from inside and outside the greenhouse were collected with an IoT (internet of things) data acquisition system. First, Pearson correlation analysis was applied to identify the factors most strongly correlated with each predicted variable, which reduced the number of model inputs. Then a time series dataset of the selected environmental features was constructed and used as the input of the SSA-LSTM-Attention model. The model combines three components: the sparrow search algorithm (SSA) for hyperparameter optimization, the long short-term memory (LSTM) network for handling temporal dependencies, and the attention mechanism for weighting the most relevant information in the input sequence.
For four environmental factors in solar greenhouses (indoor temperature, indoor humidity, light intensity, and soil moisture), the SSA-LSTM-Attention model achieved an average fitting index of 97.9%. This was 8.1, 4.1, 3.5, and 3.0 percentage points higher than the BP (back propagation neural network), LSTM, GRU (gate recurrent unit), and LSTM-Attention (long short-term memory-attention mechanism) models, respectively. The mean absolute percentage error averaged 2.6%, which was 6.5, 3.2, 2.8, and 2.5 percentage points lower than that of the four control models, respectively. The results show the effect of SSA optimization and the improvement in LSTM prediction accuracy brought by the attention mechanism. Automatic hyperparameter optimization of the LSTM-Attention model with SSA therefore improves prediction accuracy and provides data support for proactive environmental regulation in solar greenhouses, which may help growers control the environmental factors that influence crop growth more precisely.
-
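0. Introduction
By 2022, the total area of protected agriculture in China had exceeded 40 million mu (about 2.67 million ha), accounting for more than 80% of the world total. Solar greenhouses are an important part of protected agriculture and the main site of winter vegetable production in North China[1-2]. Changes in the environmental factors of a solar greenhouse directly affect crop yield and quality: indoor temperature and indoor humidity affect crop transpiration and stomatal opening and closing, light intensity affects photosynthesis, and soil moisture is the basis of crop growth and development. Compared with complex environmental factors such as soil carbon flux, these four factors can be measured with mature and relatively low-cost equipment. A solar greenhouse environment prediction model built on them is therefore reliable and practical, and helps stabilize the greenhouse climate and increase crop yield[3-5]. Greenhouse environmental factors are mainly modeled with mechanistic models and data-driven models[6-9]. Mechanistic models describe the trend of the greenhouse environment from physical, chemical, and biological principles combined with the conservation of energy and mass. LIU et al.[10] built a transient greenhouse climate model based on a wall temperature estimation method derived from energy conservation and predicted the air temperature and humidity of a solar greenhouse. ZHAO et al.[11] built a one-dimensional transient temperature prediction model for solar greenhouses and predicted the daily temperature variation of each surface inside the greenhouse. MAO et al.[12] analyzed greenhouse heat transfer characteristics with computational fluid dynamics (CFD) and proposed a dynamic modeling method for greenhouse temperature and humidity that combines CFD simulation with measured data. These studies must account for many physical processes and parameters inside the greenhouse, so the modeling process is complex and hard to apply in actual production.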
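Greenhouse environmental data are large in volume, nonlinear, time-dependent, and coupled, so data-driven models are better suited to greenhouse environment prediction than mechanistic models and achieve higher accuracy. In recent years, deep learning algorithms including convolutional neural networks (CNN), recurrent neural networks (RNN), long short-term memory (LSTM), and gated recurrent units (GRU) have been widely applied to complex time series prediction tasks[13-14]. FANG et al.[15] proposed a sequence-to-sequence model based on LSTM networks to predict indoor temperature. HU Jin et al.[16] proposed a solar greenhouse temperature prediction model based on 1D-CNN-GRU (one dimensional convolutional neural networks-gated recurrent unit) to predict the temperature 1 to 4 h ahead. These studies show that deep learning methods are suitable for time series prediction, but they predict only a single environmental factor and do not consider the weights of the input environmental factors, which leads to weak generalization and low practicality. YANG et al.[17] therefore proposed a feed-forward attention mechanism-long short-term memory (FAM-LSTM) model to predict the temperature and humidity of a solar greenhouse simultaneously. ZHANG Guanshan et al.[18] proposed a greenhouse air temperature prediction model based on LSTM-Attention (long short-term memory-attention mechanism) with a mean squared error (MSE) of 0.51 ℃, a marked improvement in accuracy over the conventional LSTM. Because recurrent neural networks are difficult to tune manually, converge slowly, and are prone to local optima and overfitting, many researchers have optimized them with bio-inspired algorithms such as the genetic algorithm (GA)[19], particle swarm optimization (PSO)[20], and the sparrow search algorithm (SSA)[21]. ZU Linlu et al.[22] proposed a solar greenhouse environment prediction model based on SSA-LSTM (sparrow search algorithm-long short-term memory) that accurately predicted six environmental parameters. XU Zehai et al.[23] proposed a prediction model based on an SSA-optimized BP network that accurately predicted plant stem water content.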
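The accuracy of conventional time series prediction models currently depends on manual, experience-based parameter tuning, which introduces high uncertainty, and accuracy decreases as the predicted time series grows longer[24-26]. This study therefore builds an LSTM model combined with an attention mechanism, which adjusts the weights of the input environmental factors to improve the accuracy of long time series prediction, and uses SSA to optimize the model parameters automatically, avoiding the influence of manual tuning, so as to predict multidimensional greenhouse environmental data accurately.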
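1. Data acquisition and processing
1.1 Experimental site
The experiment was conducted from 1 September 2023 to 1 April 2024 in solar greenhouse No. 2 of the Doctor Farm for the industrialization of the smart cloud service system for protected agriculture (设施农业智慧云服务系统产业化博士农场) in Yukou Town, Pinggu District, Beijing (117°01′E, 40°17′N). The crop grown in the greenhouse was strawberry. The greenhouse is a new assembled solar greenhouse with a cold-formed steel structure, 80 m long from east to west, with a north-south span of 12 m and a ridge height of 5.2 m. The back wall and front slope are covered with Jiatai-brand three-layer cotton-core thermal blankets, and vents are provided on the sides and the top. The greenhouse is also equipped with a roller-blind machine and water and fertilizer irrigation equipment.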
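1.2 Experimental data acquisition system
To ensure the accuracy of the environmental data collected by the IoT data acquisition system, three sets of IoT data acquisition devices were installed inside the solar greenhouse at 20, 40, and 60 m from east to west, at a height of 1.5 m. The soil temperature and moisture sensors collected data in the soil layer at a depth of 12 cm. The IoT data acquisition system consists of an environmental data acquisition module, a transmission module, and an agricultural smart cloud platform.
Environmental data inside the solar greenhouse were collected with JWSK-V, OSA-1W, and ZD-6 sensors from Beijing Kunlun Haian Sensor Technology Co., Ltd. (北京昆仑海岸传感器技术有限公司). Outdoor environmental data were collected with a small weather station from Juying Electronic Technology Co., Ltd. (聚英电子科技有限公司); the collected parameters include air temperature and humidity, photosynthetically active radiation, wind speed, wind direction, and rainfall. The main technical parameters of the indoor and outdoor sensors are listed in Table 1. The environmental data acquisition module sends the data to the IoT gateway, which forwards them to the base station through GPRS (general packet radio service). The server communicates with the base station and stores the data in a MySQL database, and the data are finally displayed on the smart cloud platform and in a WeChat mini program. The greenhouse environmental data acquisition process is shown in Fig. 1.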
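Table 1. Main technical parameters of the sensors

| Position | Sensor type | Model | Range | Precision |
| --- | --- | --- | --- | --- |
| Indoor | Air temperature and humidity sensor | JWSK-V | Temperature: −20 to 60 ℃; humidity: 0 to 100% | ±0.5 ℃; ±3% |
| Indoor | Soil temperature and moisture sensor | OSA-1W | Temperature: −40 to 90 ℃; humidity: 0 to 100% | ±0.2 ℃; ±2% |
| Indoor | Light sensor | ZD-6 | Light: 0 to 140000 lx | ±7000 lx |
| Outdoor | Weather louver-box collector | JYBYH-WS4-RS | Temperature: −40 to 120 ℃; humidity: 0 to 100%RH; atmospheric pressure: 300 to 1100 hPa | ±0.3 ℃; ±2.0%; ±1 hPa |
| Outdoor | Photosynthetic radiation sensor | JYS-GH-RS | Photosynthetic radiation: 0 to 2500 W·m−2 | ±1 W·m−2 |
| Outdoor | Rainfall sensor | JY-YX | 0 to 4 mm·min−1 | ±0.32 mm·min−1 |
| Outdoor | Wind direction sensor | JY-FX-ARS | Wind direction: 0 to 360° | ±1° |
| Outdoor | Wind speed sensor | JY-FS-ARS | Wind speed: 0 to 30 m·s−1 | ±0.2 m·s−1 |

1.3 Data preprocessing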
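To support scientific management of strawberry cultivation, this experiment took the environmental factors over the whole strawberry growth cycle as the research object. Environmental data were collected continuously for 7 months at a sampling interval of 5 min, giving 61344 groups of data, each containing 12 environmental factors.
1.3.1 Missing value handling
To keep the filled data plausible and to avoid spurious long-distance interpolation caused by large gaps, outliers in the collected data were first removed with box plots. Missing values were then filled by mean filling, by linear interpolation, and with data from the nearest position to the missing value under the same weather conditions.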
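The cleaning step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the whisker factor `k = 1.5`, the gap limit `max_gap`, the function names, and the sample values are all assumptions, and only the linear-interpolation part of the filling strategy is shown.

```python
import numpy as np

def mask_boxplot_outliers(x, k=1.5):
    """Replace values outside the box-plot whiskers with NaN (k is an assumed factor)."""
    x = np.asarray(x, dtype=float)
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    masked = x.copy()
    masked[(x < lower) | (x > upper)] = np.nan  # comparisons with NaN are False
    return masked

def interpolate_short_gaps(x, max_gap=6):
    """Linearly interpolate interior NaN runs of at most max_gap samples."""
    x = np.asarray(x, dtype=float)
    valid = ~np.isnan(x)
    if not valid.any():
        return x.copy()
    idx = np.arange(x.size)
    filled = x.copy()
    filled[~valid] = np.interp(idx[~valid], idx[valid], x[valid])
    # Undo the fill for long runs and for runs that touch either end of the series.
    i = 0
    while i < x.size:
        if valid[i]:
            i += 1
            continue
        j = i
        while j < x.size and not valid[j]:
            j += 1
        if (j - i) > max_gap or i == 0 or j == x.size:
            filled[i:j] = np.nan
        i = j
    return filled

# Made-up indoor temperatures (deg C) with one spike and one missing sample.
raw = np.array([20.0, 20.5, 95.0, 21.5, np.nan, 22.5, 23.0])
cleaned = interpolate_short_gaps(mask_boxplot_outliers(raw))
```

Gaps longer than `max_gap` are left as NaN here so that they can be filled by another rule, such as data from a nearby period with the same weather conditions.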
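1.3.2 Correlation analysis
The collected environmental data were analyzed with the Pearson correlation coefficient (PCC)[27], which reflects the linear correlation between continuous random variables. It is calculated as
$$ R_{xy}=\frac{E\left[\left(X-\bar{X}\right)\left(Y-\bar{Y}\right)\right]}{\sigma\left(X\right)\cdot\sigma\left(Y\right)} $$ (1)
where X and Y are two continuous random variables, $ \bar{X} $ and $ \bar{Y} $ are the means of X and Y, $ E[(X-\bar{X})(Y-\bar{Y})] $ is the covariance of the two variables, and $ \sigma(X) $ and $ \sigma(Y) $ are the standard deviations of X and Y. The Pearson correlation coefficient ranges from −1 to 1, and the closer its absolute value is to 1, the stronger the linear correlation between the variables. The correspondence between the correlation coefficient and the degree of correlation is given in Table 2.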
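As a rough illustration of Eq. (1) and the grading in Table 2, the sketch below computes the coefficient with NumPy and keeps the candidate inputs whose absolute correlation exceeds a threshold. The function names, the 0.6 threshold, and the sample series are assumptions made for this example.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return float(cov / (x.std() * y.std()))

def correlation_degree(r):
    """Map |R| to the degree of correlation used in Table 2."""
    a = abs(r)
    if a > 0.8:
        return "extremely strong"
    if a > 0.6:
        return "strong"
    if a > 0.4:
        return "medium"
    if a > 0.2:
        return "weak"
    return "extremely weak"

def select_inputs(target, candidates, threshold=0.6):
    """Keep the candidate factors whose |R| with the target exceeds threshold (assumed)."""
    return [name for name, series in candidates.items()
            if abs(pearson_r(target, series)) > threshold]

# Made-up series: light follows indoor temperature, rainfall does not.
indoor_temp = [18.0, 20.0, 25.0, 27.0, 22.0]
candidates = {"light": [2.0, 4.0, 9.0, 11.0, 6.0], "rainfall": [0.0, 1.0, 0.0, 1.0, 0.0]}
selected = select_inputs(indoor_temp, candidates)
```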
Table 2. Correspondence between the correlation coefficient Rxy and the degree of correlation

| Degree of correlation | Correlation coefficient |
| --- | --- |
| Extremely strong correlation | 0.8 < $ \lvert R_{xy} \rvert $ ≤ 1 |
| Strong correlation | 0.6 < $ \lvert R_{xy} \rvert $ ≤ 0.8 |
| Medium correlation | 0.4 < $ \lvert R_{xy} \rvert $ ≤ 0.6 |
| Weak correlation | 0.2 < $ \lvert R_{xy} \rvert $ ≤ 0.4 |
| Extremely weak correlation | 0 < $ \lvert R_{xy} \rvert $ ≤ 0.2 |

1.3.3 Normalization
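To keep the data consistent and improve processing efficiency, the input environmental factors were normalized by mapping the data to the interval from 0 to 1. For input data X1, X2, X3, ..., XN, the normalization is calculated as
$$ X=\frac{X_{i}-\mathrm{Min}\left(X_{c}\right)}{\mathrm{Max}\left(X_{c}\right)-\mathrm{Min}\left(X_{c}\right)} $$ (2)
where Max and Min are the maximum and minimum values in the training set, X is the normalized value, and c is a positive integer.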
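A small sketch of Eq. (2) is given below, under the assumption that the minimum and maximum are taken per feature column from the training set and then reused for any other data. The function names and sample values are hypothetical.

```python
import numpy as np

def fit_min_max(train):
    """Return the per-column minimum and maximum of the training set."""
    train = np.asarray(train, dtype=float)
    return train.min(axis=0), train.max(axis=0)

def min_max_scale(x, x_min, x_max):
    """Map values to [0, 1] using training-set statistics."""
    span = np.where(x_max > x_min, x_max - x_min, 1.0)  # guard against constant columns
    return (np.asarray(x, dtype=float) - x_min) / span

def min_max_invert(x_scaled, x_min, x_max):
    """Map normalized values back to physical units."""
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return np.asarray(x_scaled, dtype=float) * span + x_min

# Made-up training data: columns are temperature (deg C) and humidity (%).
train = np.array([[10.0, 40.0], [20.0, 60.0], [30.0, 80.0]])
x_min, x_max = fit_min_max(train)
scaled = min_max_scale(train, x_min, x_max)
restored = min_max_invert(scaled, x_min, x_max)
```

Keeping `x_min` and `x_max` from the training set makes it possible to convert predictions back to physical units with `min_max_invert` before computing error metrics.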
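2. Construction of the environment prediction model
2.1 Construction of the SSA-LSTM-Attention model
The environmental parameter prediction model was built in two steps. In the first step, an attention mechanism was added to the LSTM model. The attention mechanism analyzes the features of the input environmental data and assigns a weight to each feature in the form of a probability distribution, which compensates for the accuracy loss that LSTM suffers from information loss when predicting long time series. In the second step, SSA was added to the LSTM model to optimize the hyperparameters. This removes the loss of accuracy caused by the randomness of manual tuning and further improves the prediction accuracy of the model.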
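LSTM evolved from the RNN. It can capture longer-term dependencies and to some extent solves the vanishing and exploding gradient problems of the RNN. By introducing a gating mechanism, LSTM gains stronger memory and stability when learning long time series data. The gating mechanism controls the flow of information through three key gates: the forget gate, the input gate, and the output gate[28]. The forget gate decides which information is discarded from the memory cell, the input gate controls the storage of new information, and the output gate decides which information is output from the memory cell. As a result, LSTM outperforms conventional time series prediction algorithms such as the RNN in both prediction accuracy and generalization, and gives more accurate predictions in particular for sequence data with complex temporal dependencies.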
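The gate behavior described above can be made concrete with a single LSTM time step written in NumPy. This sketch uses the standard LSTM cell equations rather than anything specific to this paper; the stacked weight layout and the function names are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step.

    W: (4H, X), U: (4H, H), b: (4H,), stacked in the assumed order
    forget gate, input gate, candidate state, output gate.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    f = sigmoid(z[0:H])            # forget gate: what to drop from the memory cell
    i = sigmoid(z[H:2 * H])        # input gate: how much new information to store
    g = np.tanh(z[2 * H:3 * H])    # candidate memory content
    o = sigmoid(z[3 * H:4 * H])    # output gate: what to expose from the memory cell
    c_t = f * c_prev + i * g       # updated memory cell
    h_t = o * np.tanh(c_t)         # hidden state passed to the next step
    return h_t, c_t

# With all-zero weights every gate is 0.5 and the candidate is 0,
# so the memory cell is simply halved.
H, X = 2, 3
h_t, c_t = lstm_step(np.ones(X), np.zeros(H), np.ones(H),
                     np.zeros((4 * H, X)), np.zeros((4 * H, H)), np.zeros(4 * H))
```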
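The attention mechanism is inspired by visual attention in animals[29]. It lets the model focus on the information in the input sequence that matters most for the current task, which improves the recognition of important features and the accuracy of prediction, and allows LSTM to handle long input sequences better and to adapt to sequences of different lengths and complexity. The principle of the attention mechanism is shown in Fig. 2.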
Figure 2. The structure of the attention mechanism
Note: X1, Xt-1, Xt ... XN is the input sequence; h1, ht-1, ht ... hN is the hidden state of the input sequence; a1, at-1, at ... aN is the probability distribution of hidden layer attention; the $ \oplus $ symbol represents element-wise addition between matrices; yt is the optimized output sequence.
The LSTM-Attention model has five layers: the input layer, the LSTM layer, the Attention layer, the fully connected layer, and the output layer, as shown in Fig. 3.
Figure 3. The structure of the LSTM-Attention model
Note: The input layer is a three-dimensional array transformed from multidimensional time series feature variables; LSTM1 is the first long short-term memory layer; LSTM2 is the second long short-term memory layer; dropout is the dropout layer; the $ \oplus $ symbol represents element-wise addition between matrices; the $ \otimes $ symbol represents element-wise multiplication between matrices.
1) Input layer: the model input is a set of multidimensional time series feature variables, which are converted into a three-dimensional array (S, T, X) suitable for the LSTM layer[30], where S is the number of input samples, T is the time dimension, and X is the feature dimension;
2) LSTM layer: two LSTM layers and two dropout layers. Adding a dropout layer after each LSTM layer improves generalization and prevents overfitting;
3) Attention layer: Eqs. (3) and (4) give the correlation between the hidden-layer outputs and the corresponding attention weights, and Eq. (5) forms the weighted sum of the outputs at each time step with their weights to obtain the layer output;
4) Fully connected layer: extracts and combines features from the output of the Attention layer and converts the data dimension;
5) Output layer: outputs the result yt of the environment prediction model through Eq. (6). The whole process is calculated as
$$ d_{t}=u\tanh\left(W_{X}h_{t}+b\right) $$ (3)
$$ \alpha_{t}=\frac{\exp\left(d_{t}\right)}{\sum\nolimits_{i=1}^{t}\exp\left(d_{i}\right)} $$ (4)
$$ v_{t}=\sum\nolimits_{t=1}^{j}\alpha_{t}h_{t} $$ (5)
$$ y_{t}=\sigma\left(W_{o}h_{t}+b\right) $$ (6)
where ht, dt, vt, and yt are the hidden-layer output, the correlation between vectors at time t, the output of the Attention layer, and the final predicted value, respectively; $ \alpha_{t} $ is the attention weight; u is the weight coefficient, WX and Wo are weight matrices, and b is the bias.
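The attention step can be sketched in NumPy as below. The sketch assumes that the scores are normalized with the usual softmax, that is, with exponentials in both numerator and denominator, and that u is a vector with one entry per attention unit; the shapes and names are illustrative choices rather than the authors' implementation.

```python
import numpy as np

def attention_pool(h, W_x, b, u):
    """Attention over LSTM hidden states.

    h: (T, H) hidden states; W_x: (A, H); b: (A,); u: (A,).
    Returns the weights alpha (T,) and the weighted sum v (H,).
    """
    d = np.tanh(h @ W_x.T + b) @ u       # score d_t for every time step, Eq. (3)
    d = d - d.max()                      # shift for numerical stability
    alpha = np.exp(d) / np.exp(d).sum()  # softmax weights, Eq. (4)
    v = alpha @ h                        # weighted sum of hidden states, Eq. (5)
    return alpha, v

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 4))              # 5 time steps, 4 hidden units (made-up sizes)
alpha, v = attention_pool(h, rng.normal(size=(3, 4)), np.zeros(3), rng.normal(size=3))
```

When all scores are equal, for example when `u` is zero, the weights become uniform and `v` reduces to the mean of the hidden states, which is a convenient sanity check.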
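SSA is inspired by the foraging and anti-predation behavior of sparrow flocks, which consist of discoverers, followers, and scouts[21]. Compared with conventional optimization algorithms, SSA needs fewer iterations, converges quickly, and has strong search ability, which shortens model training time[31]. In the simulated foraging process, let the number of sparrows be N and the number of variables be D. The position of the sparrows is then XN,D, and the fitness FX of the population can be expressed as
$$ \boldsymbol{F}_{\boldsymbol{X}}=\left[\begin{array}{c} f\left(\left[X_{1,1}\; X_{1,2}\; \cdots\; X_{1,D}\right]\right)\\ f\left(\left[X_{2,1}\; X_{2,2}\; \cdots\; X_{2,D}\right]\right)\\ \vdots\\ f\left(\left[X_{N,1}\; X_{N,2}\; \cdots\; X_{N,D}\right]\right) \end{array}\right] $$ (7)
In each iteration, the positions of the discoverers $ X_{p,q}^{t} $, the followers $ X_{p,c}^{t} $, and the scouts $ X_{p,d}^{t} $ are updated from their positions in the previous iteration. The positions in iteration t+1 are calculated as
$$ X_{p,q}^{t+1}=\left\{\begin{aligned} & X_{p,q}^{t}\cdot \exp\left(\frac{-p}{a\cdot t_{max}}\right), & R_{2} < S_{T}\\ & X_{p,q}^{t}+Q, & R_{2}\ge S_{T} \end{aligned}\right. $$ (8)
$$ X_{p,c}^{t+1}=\left\{\begin{aligned} & Q\cdot \exp\left(\frac{X_{worst}^{t}-X_{p,c}^{t}}{p^{2}}\right), & p > \frac{N}{2}\\ & X_{b}^{t+1}+\left|X_{p,c}^{t}-X_{b}^{t+1}\right|\cdot \boldsymbol{A}^{+}, & p\le \frac{N}{2} \end{aligned}\right. $$ (9)
$$ X_{p,d}^{t+1}=\left\{\begin{aligned} & X_{best}^{t}+\beta \left|X_{p,d}^{t}-X_{best}^{t}\right|, & f_{p}\ne f_{g}\\ & X_{p,d}^{t}+K\left(\frac{\left|X_{p,d}^{t}-X_{worst}^{t}\right|}{\left(f_{p}-f_{w}\right)+\epsilon}\right), & f_{p}=f_{g} \end{aligned}\right. $$ (10)
where a and Q are random numbers that follow a uniform and a normal distribution, respectively; p and q denote the current coordinates; tmax is the maximum number of iterations; R2 and ST are the warning value and the safety threshold; Xb, Xworst, and $ X_{best}^{t} $ are the best position of the discoverers, the global worst position, and the global best position; A is a 1×D matrix whose elements take values in {-1, 1}; β is the step-size control parameter, which follows a normal distribution; K is a random number in [-1, 1] that controls the direction and step size of the flock's foraging; ε is a very small constant; and fp, fg, and fw are the fitness values of the current individual, the best position, and the worst position, respectively.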
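SSA searches iteratively for the optimal combination of training parameters, which helps improve the prediction accuracy of the model. The flow chart of the whole prediction process is shown in Fig. 4.
SSA optimizes three training hyperparameters of the model: the number of neurons, the learning rate, and the batch size. SSA is initialized by setting parameters such as the global best position of the population, the initial positions, and the number of sparrows. A boundary function is then defined that traverses the parameters and checks whether each lies between its lower and upper bounds. The mean squared error is used as the fitness function, and the fitness of each sparrow is evaluated and ranked with it. The iteration stops when the maximum number of iterations is reached and the global optimum is output. The new hyperparameters are passed to the LSTM-Attention model for another round of training, and the predictions are finally compared with those of the baseline models.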
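The loop described above (initialize, clip to bounds, rank by fitness, update, keep the global best) can be sketched compactly as follows. This is a simplified reading of the discoverer, follower, and scout updates and not the authors' implementation: the safety threshold of 0.8, the scalar random steps, the evaluation of A⁺ as A/D, and the sphere test function are all assumptions made for this example.

```python
import numpy as np

def ssa_minimize(fitness, lower, upper, n_sparrows=20, n_iter=100,
                 producer_frac=0.2, scout_frac=0.15, safety=0.8, seed=0):
    """Minimal sparrow search sketch that minimizes fitness within [lower, upper]."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    n_prod = max(1, int(producer_frac * n_sparrows))
    n_scout = max(1, int(scout_frac * n_sparrows))
    evaluate = lambda pop: np.array([fitness(x) for x in pop])

    X = rng.uniform(lower, upper, size=(n_sparrows, dim))
    fit = evaluate(X)
    best_x, best_f = X[fit.argmin()].copy(), float(fit.min())

    for _ in range(n_iter):
        order = np.argsort(fit)                    # rank sparrows, best first
        X, fit = X[order], fit[order]
        x_worst = X[-1].copy()

        # Discoverers: shrink the search when safe, otherwise take a random step.
        if rng.random() < safety:
            for p in range(n_prod):
                a = rng.uniform(1e-3, 1.0)
                X[p] = X[p] * np.exp(-(p + 1) / (a * n_iter))
        else:
            X[:n_prod] += rng.normal(size=(n_prod, 1))
        X[:n_prod] = np.clip(X[:n_prod], lower, upper)
        fit[:n_prod] = evaluate(X[:n_prod])
        x_b = X[np.argmin(fit[:n_prod])].copy()    # best discoverer position

        # Followers: the worse half relocates, the rest move toward x_b.
        for p in range(n_prod, n_sparrows):
            if p + 1 > n_sparrows / 2:
                X[p] = rng.normal() * np.exp((x_worst - X[p]) / (p + 1) ** 2)
            else:
                A = rng.choice([-1.0, 1.0], size=dim)
                X[p] = x_b + np.abs(X[p] - x_b) @ A / dim
        X = np.clip(X, lower, upper)               # boundary check
        fit = evaluate(X)

        # Scouts: a random subset reacts to danger.
        g, w = fit.argmin(), fit.argmax()
        x_best, x_w, f_g, f_w = X[g].copy(), X[w].copy(), fit[g], fit[w]
        for p in rng.choice(n_sparrows, size=n_scout, replace=False):
            if fit[p] != f_g:
                X[p] = x_best + rng.normal() * np.abs(X[p] - x_best)
            else:
                K = rng.uniform(-1.0, 1.0)
                X[p] = X[p] + K * np.abs(X[p] - x_w) / ((fit[p] - f_w) + 1e-12)
        X = np.clip(X, lower, upper)
        fit = evaluate(X)
        if fit.min() < best_f:                     # keep the best solution seen so far
            best_x, best_f = X[fit.argmin()].copy(), float(fit.min())
    return best_x, best_f

# Toy usage on a sphere function; a real fitness would train the model and return its MSE.
sphere = lambda x: float(np.sum(x ** 2))
best_x, best_f = ssa_minimize(sphere, [-5.0] * 3, [5.0] * 3)
```

For hyperparameter search, `fitness` would take a candidate vector such as (learning rate, number of neurons, batch size), round the last two to integers, train the LSTM-Attention model, and return the resulting mean squared error.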
2.2 Model evaluation metrics
To verify the prediction performance of the LSTM network optimized by SSA and combined with the attention mechanism, the coefficient of determination (R2), the root mean square error (RMSE), and the mean absolute percentage error (MAPE) were used to evaluate the model. R2, RMSE, and MAPE are calculated as follows.
$$ R^{2}=1-\frac{\sum\nolimits_{i=1}^{N}\left(X_{i}-y_{i}\right)^{2}}{\sum\nolimits_{i=1}^{N}\left(X_{i}-\bar{X}\right)^{2}} $$ (11)
$$ \mathrm{RMSE}=\sqrt{\frac{1}{N}\sum\nolimits_{i=1}^{N}\left(X_{i}-y_{i}\right)^{2}} $$ (12)
$$ \mathrm{MAPE}=\frac{100\text{%}}{N}\sum\nolimits_{i=1}^{N}\frac{\left|y_{i}-X_{i}\right|}{X_{i}} $$ (13)
where N is the number of samples, Xi and yi are the measured and predicted values at the current time, and $ \bar{X} $ is the mean of the measured values.
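For reference, the three metrics can be computed with NumPy as in the sketch below, using the standard definitions (R2 as one minus the ratio of residual to total sum of squares, and squared errors inside the RMSE). The function names and sample values are made up for illustration.

```python
import numpy as np

def r2_score(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    x = np.asarray(measured, dtype=float)
    y = np.asarray(predicted, dtype=float)
    ss_res = np.sum((x - y) ** 2)
    ss_tot = np.sum((x - x.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(measured, predicted):
    """Root mean square error, in the unit of the measured variable."""
    x = np.asarray(measured, dtype=float)
    y = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((x - y) ** 2)))

def mape(measured, predicted):
    """Mean absolute percentage error in percent; measured values must be nonzero."""
    x = np.asarray(measured, dtype=float)
    y = np.asarray(predicted, dtype=float)
    return float(100.0 * np.mean(np.abs(y - x) / np.abs(x)))

# Made-up measured and predicted values.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]
scores = (r2_score(measured, predicted), rmse(measured, predicted), mape(measured, predicted))
```

The metrics should be computed on values converted back to physical units, since RMSE depends on the scale of the variable.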
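3. Results and analysis
3.1 Correlation analysis of influencing factors
The experimental data are the averages of the three sets of sensors. The time resolution of the data was converted from 5 min to 30 min, giving 10224 records for each environmental factor. The Pearson correlation coefficient was used to analyze the correlation between environmental factors, and the coefficients are listed in Table 3. Indoor temperature is strongly correlated with soil temperature, light intensity, outdoor temperature, and outdoor photosynthetic radiation. Indoor humidity is strongly correlated with indoor temperature, light intensity, and outdoor photosynthetic radiation. Light intensity is strongly correlated with indoor temperature, soil temperature, outdoor temperature, outdoor photosynthetic radiation, and outdoor wind speed. Therefore, each of these three environmental factors was predicted with itself and its strongly correlated factors as inputs. Soil moisture is moderately correlated with soil temperature, indoor temperature, outdoor temperature, and outdoor photosynthetic radiation, and weakly correlated with light intensity. The main reason is that the greenhouse uses soilless cultivation and under-film drip irrigation, and the film cover reduces evaporation of soil moisture and thus its association with other environmental factors[32]. Soil moisture was therefore predicted with five environmental factors as inputs: soil moisture, soil temperature, indoor temperature, outdoor temperature, and outdoor photosynthetic radiation.
Table 3. Correlation coefficients between each environmental factor and the predicted environmental factors

| Influence factor | Indoor temperature | Indoor humidity | Light intensity | Soil moisture |
| --- | --- | --- | --- | --- |
| Indoor temperature | 1.00 | −0.88 | 0.91 | 0.49 |
| Indoor humidity | −0.88 | 1.00 | −0.86 | −0.53 |
| Light intensity | 0.91 | −0.86 | 1.00 | 0.37 |
| Soil moisture | 0.49 | −0.53 | 0.37 | 1.00 |
| Soil temperature | 0.84 | −0.74 | 0.64 | 0.64 |
| Outdoor temperature | 0.75 | −0.62 | 0.57 | 0.45 |
| Outdoor humidity | −0.47 | 0.63 | −0.44 | −0.46 |
| Outdoor photosynthetic radiation | 0.82 | −0.80 | 0.92 | 0.44 |
| Outdoor wind direction | 0.16 | −0.17 | 0.07 | 0.08 |
| Outdoor wind speed | 0.40 | −0.40 | 0.35 | −0.29 |
| Outdoor rainfall | −0.02 | 0.02 | −0.02 | −0.02 |
| Outdoor atmospheric pressure | −0.07 | 0.07 | −0.33 | −0.03 |

Note: The columns are the predicted environmental factors.
3.2 Model training and optimization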
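The solar greenhouse environmental factor model was written in Python 3.11 with PyCharm as the development environment and TensorFlow as the framework. When SSA was used to optimize the parameters of the LSTM-Attention model, the original dataset was divided into a training set and a test set at a ratio of 9:1, the activation function was SELU (scaled exponential linear unit), and the optimizer was Adam (adaptive moment estimation). The training parameters were set as follows: the sparrow population was 20, of which producers (discoverers) accounted for 20% and scouts, whose positions were generated randomly, for 15%; the maximum number of iterations was 100; the search dimension was 3, namely the learning rate in [0.001, 0.01], the number of neurons in [10, 100], and the batch size in [32, 512]; and the maximum number of training epochs was 100. During training, a boundary function checked whether all parameters lay between their lower and upper bounds, the mean squared error of the prediction was used as the sparrow fitness, and training stopped when the fitness remained unchanged for 3 consecutive rounds. The hyperparameters of the LSTM-Attention model optimized by SSA are listed in Table 4.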
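The 9:1 split and the three-dimensional (S, T, X) input array can be sketched as below. The window length, the one-step-ahead target, and the chronological (unshuffled) split are assumptions made for this example, since these details are not specified in the text.

```python
import numpy as np

def make_windows(data, target_col, window):
    """Build an (S, T, X) input array and next-step targets from an (n, X) series.

    Requires more than `window` rows.
    """
    data = np.asarray(data, dtype=float)
    n = data.shape[0] - window
    X = np.stack([data[i:i + window] for i in range(n)])  # S samples of T steps each
    y = data[window:, target_col]                         # value right after each window
    return X, y

def chronological_split(X, y, train_frac=0.9):
    """Split without shuffling so that the test set follows the training set in time."""
    cut = int(len(X) * train_frac)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])

# Made-up data: 100 time steps of 3 features, window of 4 steps (assumed length).
series = np.arange(300, dtype=float).reshape(100, 3)
X, y = make_windows(series, target_col=0, window=4)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```

Here `X` has shape (96, 4, 3), which matches the (samples, time steps, features) layout that LSTM layers expect.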
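Table 4. Optimization results of SSA-LSTM-Attention

| Environmental factor | Fitness | Neurons | Batch size | Learning rate |
| --- | --- | --- | --- | --- |
| Indoor temperature | 4.06 | 67 | 128 | 0.0036 |
| Indoor humidity | 5.15 | 77 | 128 | 0.0014 |
| Light intensity | 4.91 | 45 | 128 | 0.0019 |
| Soil moisture | 8.45 | 42 | 128 | 0.0031 |

The BP, LSTM, GRU, and LSTM-Attention environment prediction models were run in the same environment as the model built in this study. For all of them, the number of iterations was 100, the learning rate was 0.01, the batch size was 32, and the optimizer was Adam. The number of neurons in GRU and LSTM was set to 128, and the maximum number of training epochs was set to 100.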
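3.3 Comparison of the prediction results of different algorithms
With the SSA optimization results as the parameters of the LSTM-Attention model, the trends of indoor temperature, indoor humidity, light intensity, and soil moisture were predicted for the 10 d from 16 to 26 February 2024. The results are shown in Fig. 5. To verify the accuracy of the prediction model, the SSA-LSTM-Attention model was compared with the BP, LSTM, GRU, and LSTM-Attention models, and the evaluation metrics of each model are listed in Table 5.
Table 5 shows that the SSA-LSTM-Attention model is the most accurate of the models compared. It performed best for indoor temperature, with a fitting index of 98.6%, which was 7.9, 4.3, 3.7, and 3.1 percentage points higher than that of the BP, LSTM, GRU, and LSTM-Attention models, respectively. Its RMSE was 0.6, which was 0.9, 0.5, 0.4, and 0.2 lower than that of the BP, LSTM, GRU, and LSTM-Attention models, respectively. The prediction of soil moisture was slightly less accurate, but the fitting index still reached 95.1%, which was 7.2, 2.6, 2.1, and 1.5 percentage points higher than that of the BP, LSTM, GRU, and LSTM-Attention models, respectively. Its RMSE was 0.7, which was 0.7, 0.4, 0.4, and 0.2 lower than that of the BP, LSTM, GRU, and LSTM-Attention models, respectively. Across the four environmental parameters, the SSA-LSTM-Attention model reached an average fitting index of 97.9% and an average MAPE of 2.6%. The average fitting index was 8.1, 4.1, 3.5, and 3 percentage points higher than that of the BP, LSTM, GRU, and LSTM-Attention models, respectively, which is clearly better than the control models and indicates that the model performs well on time series.
Compared with conventional time series prediction models, adding the attention mechanism clearly improved the accuracy of the LSTM model. Long-sequence prediction performed well and met the expected goal, which favors future practical application. The comparison with the LSTM-Attention model shows that SSA tuning has a clear effect: the optimized model improved prediction accuracy and saved considerable computing cost and manual effort.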
Table 5. Performance comparison of 5 prediction models

| Forecast content | Model name | RMSE | MAPE/% | R2/% |
| --- | --- | --- | --- | --- |
| Indoor temperature/℃ | BP | 1.5 | 8.3 | 90.7 |
| Indoor temperature/℃ | LSTM | 1.1 | 4.6 | 94.3 |
| Indoor temperature/℃ | GRU | 1.0 | 4.2 | 95.1 |
| Indoor temperature/℃ | LSTM-Attention | 0.8 | 4.1 | 95.5 |
| Indoor temperature/℃ | SSA-LSTM-Attention | 0.6 | 2.2 | 98.6 |
| Indoor humidity/% | BP | 2.8 | 8.6 | 90.6 |
| Indoor humidity/% | LSTM | 1.7 | 5.3 | 94.0 |
| Indoor humidity/% | GRU | 1.5 | 5.0 | 94.6 |
| Indoor humidity/% | LSTM-Attention | 1.3 | 4.5 | 95.1 |
| Indoor humidity/% | SSA-LSTM-Attention | 1.0 | 2.4 | 98.2 |
| Light intensity/lx | BP | 1725.8 | 9.3 | 90.1 |
| Light intensity/lx | LSTM | 1254.3 | 6.0 | 94.3 |
| Light intensity/lx | GRU | 1203.2 | 5.9 | 94.8 |
| Light intensity/lx | LSTM-Attention | 1083.5 | 5.7 | 95.5 |
| Light intensity/lx | SSA-LSTM-Attention | 603.0 | 2.6 | 98.4 |
| Soil moisture/% | BP | 1.4 | 10.0 | 87.9 |
| Soil moisture/% | LSTM | 1.1 | 7.1 | 92.5 |
| Soil moisture/% | GRU | 0.9 | 6.5 | 93.0 |
| Soil moisture/% | LSTM-Attention | 0.9 | 6.1 | 93.6 |
| Soil moisture/% | SSA-LSTM-Attention | 0.7 | 3.2 | 95.1 |

Note: RMSE is root mean square error; MAPE is mean absolute percentage error; R2 is the coefficient of determination; BP is back propagation neural network; LSTM is long short-term memory; GRU is gate recurrent unit; LSTM-Attention is the model that combines LSTM and the attention mechanism; SSA-LSTM-Attention is LSTM-Attention optimized by the sparrow search algorithm.
4. Conclusions
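Conventional prediction methods for solar greenhouse environmental factors depend on manual parameter tuning and have poor accuracy for long sequences. To address these problems, this study proposed an LSTM-Attention greenhouse environment prediction model optimized by SSA.
1) Compared with the conventional LSTM model, the attention mechanism lets the model assign different attention weights at each time step or in each feature dimension, which improves prediction accuracy and allows the LSTM model to handle variable-length input sequences better. Introducing SSA for hyperparameter optimization enables automatic tuning of the LSTM-Attention model and further improves prediction accuracy.
2) The prediction performance of the SSA-LSTM-Attention model was compared with that of the BP, LSTM, GRU, and LSTM-Attention models. The average fitting index of the SSA-LSTM-Attention model was 97.9%, which was 8.1, 4.1, 3.5, and 3.0 percentage points higher than that of the BP, LSTM, GRU, and LSTM-Attention models, respectively. The average MAPE was 2.6%, which was 6.5, 3.2, 2.8, and 2.5 percentage points lower than that of the four control models, respectively. These results show that SSA-LSTM-Attention predicts clearly better than the other models. The SSA-LSTM-Attention model therefore provides a theoretical basis for accurate regulation of the greenhouse environment and helps improve the quality and production efficiency of greenhouse fruits and vegetables.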
-
Figure 5. Greenhouse environment prediction results based on SSA-LSTM-Attention
Note: In Fig. 5a-5d, the model input period is 2024-02-01 to 2024-02-15 and the model prediction period is 2024-02-16 to 2024-02-26.
-
[1] OMER A M. Analysis of development in solar greenhouses[J]. Academic Journal of Life Sciences, 2022, 8(2): 14-32.
[2] YUAN M, ZHANG Z, LI G, et al. Multi-parameter prediction of solar greenhouse environment based on multi-source data fusion and deep learning[J]. Agriculture, 2024, 14(8): 1245-1245. DOI: 10.3390/agriculture14081245
[3] KHUSHI S, KUMAR M H. Design of low cost IoT enabled greenhouse control system for precision agricultural research application[J]. IOP Conference Series: Materials Science and Engineering, 2022, 1272(1): 012004. DOI: 10.1088/1757-899X/1272/1/012004
[4] WANG K, YANG T, KONG S, et al. Air quality index prediction through TimeGAN data recovery and PSO-optimized VMD-deep learning framework[J]. Applied Soft Computing, 2024: 112626.
[5] MUNOZ M, GUZMAN J L, SANCHEZ J A, et al. A new IoT-based platform for greenhouse crop production[J]. IEEE Internet of Things Journal, 2020, PP(99): 1-1.
[6] HU Jin, YANG Yongxia, LI Yuanfang, et al. Current situation analysis and prospect of greenhouse environmental control methods[J]. Transactions of the Chinese Society of Agricultural Engineering, 2024, 40(1): 112-128. (in Chinese with English abstract) DOI: 10.11975/j.issn.1002-6819.202310214
[7] ZHANG Chuanshuai, XU Lanjun, LI Xiaolong, et al. Effects of main environmental parameters on tomato bulk growth in solar greenhouse[J]. Journal of China Agricultural University, 2019, 24(10): 118-124. (in Chinese with English abstract) DOI: 10.11841/j.issn.1007-4333.2019.10.14
[8] JIA W, WEI Z. Short term prediction model of environmental parameters in typical solar greenhouse based on deep learning neural network[J]. Applied Sciences, 2022, 12(24): 12529-12529. DOI: 10.3390/app122412529
[9] ZONG Chengji, WANG Jianyu, SONG Weitang, et al. Construction and validation of hourly air temperature prediction model in solar greenhouse at night[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2022, 38(Suppl.1): 218-225. (in Chinese with English abstract)
[10] LIU R, LI M, GUZMAN J L, et al. A fast and practical one-dimensional transient model for greenhouse temperature and humidity[J]. Computers and Electronics in Agriculture, 2021, 186: 106186. DOI: 10.1016/j.compag.2021.106186
[11] ZHAO L, LU L, LIU H, et al. A one-dimensional transient temperature prediction model for Chinese assembled solar greenhouses[J]. Computers and Electronics in Agriculture, 2023, 215: 108450. DOI: 10.1016/j.compag.2023.108450
[12] MAO C, SU Y. CFD based heat transfer parameter identification of greenhouse and greenhouse climate prediction method[J]. Thermal Science and Engineering Progress, 2024, 49: 102462. DOI: 10.1016/j.tsep.2024.102462
[13] SONG X, WANG Z, WANG H. Short-term load prediction with LSTM and FCNN models based on attention mechanisms[J]. Journal of Physics: Conference Series, 2024, 2741(1): 012026. DOI: 10.1088/1742-6596/2741/1/012026
[14] LI Li, LI Wenjun, MA Dexin, et al. Research on greenhouse tomato transpiration prediction model based on LSTM[J]. Transactions of the Chinese Society for Agricultural Machinery, 2021, 52(10): 369-376. (in Chinese with English abstract) DOI: 10.6041/j.issn.1000-1298.2021.10.038
[15] FANG Z, CRIMIER N, SCANU L, et al. Multi-zone indoor temperature prediction with LSTM-based sequence to sequence model[J]. Energy and Buildings, 2021, 245: 111053. DOI: 10.1016/j.enbuild.2021.111053
[16] HU Jin, LEI Wenye, LU Youqi, et al. Research on temperature prediction model of solar greenhouse based on 1D CNN-GRU[J]. Transactions of the Chinese Society for Agricultural Machinery, 2023, 54(8): 339-346. (in Chinese with English abstract) DOI: 10.6041/j.issn.1000-1298.2023.08.033
[17] YANG Y, GAO P, SUN Z, et al. Multistep ahead prediction of temperature and humidity in solar greenhouse based on FAM-LSTM model[J]. Computers and Electronics in Agriculture, 2023, 213: 108261. DOI: 10.1016/j.compag.2023.108261
[18] ZHANG Guanshan, DING Xiaoming, HE Fen, et al. Predicting greenhouse air temperature using LSTM-AT[J]. Transactions of the Chinese Society of Agricultural Engineering, 2024, 40(18): 194-201. (in Chinese with English abstract) DOI: 10.11975/j.issn.1002-6819.202404199
[19] SINGH D R, SINGH M K, CHAURASIA S N, et al. Genetic algorithm incorporating group theory for solving the general travelling salesman problem[J]. SN Computer Science, 5(1): 1075.
[20] HAN H G, A Y, ZHANG L. Adaptive multiobjective particle swarm optimization based on decomposed archive[J]. Acta Electronica Sinica, 2020, 48(7): 1245-1254.
[21] SAHA A. Application of sparrow search swarm intelligence optimization algorithm in identifying the critical surface in slope-stability[J]. Discover Geoscience, 2024, 2: 80. DOI: 10.1007/s44288-024-00070-w
[22] ZU Linlu, LIU Pingzeng, ZHAO Yanping, et al. Solar greenhouse environment prediction model based on SSA-LSTM[J]. Transactions of the Chinese Society for Agricultural Machinery, 2023, 54(2): 351-358. (in Chinese with English abstract) DOI: 10.6041/j.issn.1000-1298.2023.02.036
[23] XU Zehai, ZHAO Yandong. SSA-BP model for predicting water contents in stem integrating multiple environmental factors acquired via IoT[J]. Transactions of the Chinese Society of Agricultural Engineering, 2023, 39(16): 150-159. (in Chinese with English abstract) DOI: 10.11975/j.issn.1002-6819.202304150
[24] DIAA S, CEM D, MEHMET K, et al. Hybrid deep learning models for time series forecasting of solar power[J]. Neural Computing and Applications, 2024, 36(16): 9095-9112. DOI: 10.1007/s00521-024-09558-5
[25] HUANG S, LIU Q, WU Y, et al. Edible mushroom greenhouse environment prediction model based on attention CNN-LSTM[J]. Agronomy, 2024, 14(3): 473. DOI: 10.3390/agronomy14030473
[26] MA L, HE C, JIN Y, et al. Tracking control method for greenhouse environment prediction model based on real-time optimization error constraints[J]. Applied Sciences, 2023, 13(12): 7151. DOI: 10.3390/app13127151
[27] EDELMANN D, MÓRI T F, SZÉKELY G J. On relationships between the Pearson and the distance correlation coefficients[J]. Statistics & Probability Letters, 2021, 169: 108960.
[28] BENJAMIN L, MÜLLER T, VIETZ H, et al. A survey on long short-term memory networks for time series prediction[C]// Proceedings of the International Conference on Industrial Engineering and Systems Management. Beijing: IEEE, 2021: 1025-1030.
[29] ZOU Z, YAN X, YUAN Y, et al. Attention mechanism enhanced LSTM networks for latency prediction in deterministic MEC networks[J]. Intelligent Systems with Applications, 2024, 23: 200425. DOI: 10.1016/j.iswa.2024.200425
[30] DUAN J, ZUO H, BAI Y, et al. A multistep short-term solar radiation forecasting model using fully convolutional neural networks and chaotic aquila optimization combining WRF-Solar model results[J]. Energy, 2023, 271: 133525.
[31] ZU Linlu. Research on the data-driven model for predicting tomato growth in solar greenhouse [D]. Taian: Shandong Agricultural University, 2023. (in Chinese with English abstract)
[32] WANG Hanbo, GONG Daozhi, MEI Xurong, et al. Dynamics comparison of rain-fed spring maize growth and evapotranspiration in plastic mulching and un-mulching fields[J]. Transactions of the Chinese Society of Agricultural Engineering, 2012, 28(22): 88-94. (in Chinese with English abstract) DOI: 10.3969/j.issn.1002-6819.2012.22.014