
UAV Data Collection Methods for Farmland Sensor Node Data Based on Deep Reinforcement Learning

  • Abstract: Using a UAV to collect data from farmland sensor nodes avoids the problems caused by multi-hop data forwarding among network nodes, such as rapid battery depletion, premature death of nodes near the gateway, and a shortened network life cycle. Because data from adjacent sensors may be redundant and a UAV can cover several nodes at once during collection, this study optimizes the UAV's data collection route and scheme for two scenarios, partial-node collection under redundant coverage and all-node collection, so as to reduce UAV energy consumption and shorten mission completion time. In the partial-node scenario under redundant coverage, a Dueling Double Deep Q Network (DDDQN) algorithm optimizes node selection and collection order so that the collected data satisfy the coverage requirement while the UAV's energy efficiency is optimal. Simulation results show that, under the same sensing coverage requirement, this algorithm shortens the flight distance by 1.21 km and reduces energy consumption by 27.9% compared with the Deep Q Network (DQN) algorithm. In the all-node scenario, a two-level joint Double Deep Reinforcement Learning (DDRL) method optimizes the UAV's hovering positions and visiting order to minimize the total energy consumption of the data collection task. Simulation results show that, when the data volume of a single node is below 160 kB, the method saves at least 6.3% of the total energy compared with the classical Particle Swarm Optimization-based Traveling Salesman Problem (PSO-TSP) algorithm and the Minimized Energy Flight Control (MEFC) algorithm across different node counts and UAV flight speeds. Field experiment results show that, compared with the PSO-TSP algorithm, the DDRL-based data collection method reduces the total UAV energy consumption by 11.5%. The research results can serve as a reference for UAV-based data collection from wireless sensor nodes in large fields.
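The DDDQN named above uses the standard dueling architecture, which splits the action value into a state value and per-action advantages. A minimal sketch of that aggregation step (illustrative only, not the study's implementation; the array shapes and toy numbers are assumptions) might look like:

```python
import numpy as np

def dueling_q(value, advantages):
    """Combine the dueling streams: Q(s,a) = V(s) + (A(s,a) - mean_a A(s,a)).

    Subtracting the mean advantage keeps the value/advantage decomposition
    identifiable, as in the standard Dueling DQN aggregation layer.
    """
    return value + (advantages - advantages.mean(axis=-1, keepdims=True))

# Toy example: one UAV state, four candidate sensor nodes as actions.
v = np.array([2.0])                 # state value V(s)
a = np.array([1.0, 3.0, 0.0, 0.0])  # advantages A(s, a)
q = dueling_q(v, a)                 # per-node action values
best_node = int(np.argmax(q))       # greedy node choice
```

In the paper's setting, the greedy action would correspond to the next sensor node (or stop action) chosen to keep coverage high at low energy cost.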

     

    Abstract: Unmanned Aerial Vehicles (UAVs) have been widely used to collect data from wireless sensor nodes in fields. UAV collection can solve several problems, such as the lack of network infrastructure in farmland, the rapid power consumption of multi-hop data forwarding, the premature death of nodes near the gateway, and the shortened network life cycle. Moreover, because data from adjacent sensors may be redundant, a UAV can often cover multiple overlapping nodes at the same time during collection. In this study, a UAV data collection method was proposed to plan the node selection, hovering positions, and collecting order using improved deep reinforcement learning. UAV data collection from the sensor nodes was divided into two scenarios: collection from partial nodes under perceptual redundancy coverage, and collection from all nodes. The optimization aimed to save UAV energy consumption and shorten the mission completion time. Partial-node collection under perceived redundancy coverage suits cases where the redundantly covered area among nodes is relatively large, the UAV's energy is insufficient to collect from all nodes, and the data integrity requirement is low. By contrast, all-node collection fully meets a high data integrity requirement. In the partial-node scenario, a Dueling Double Deep Q Network (DDDQN) was used to select the collection nodes and plan the collecting order, achieving high UAV energy efficiency with less redundant data. Simulation results show that the DDDQN achieved greater data coverage and lower average energy consumption per unit of effective coverage than the Deep Q Network (DQN) under the same configuration. The training process of the DDDQN was also more stable than that of the DQN, with higher returns at the end of learning.
More importantly, the flight distance and energy consumption of the DDDQN were reduced by 1.21 km and 27.9%, respectively, compared with the DQN. In the all-node scenario, a two-level Double Deep Reinforcement Learning (DDRL) method was proposed to optimize the hovering positions and the UAV collection sequence, in order to minimize the total energy consumption of the UAV during data collection. The DDRL was compared with the classical Particle Swarm Optimization-based Traveling Salesman Problem (PSO-TSP) algorithm and the Minimized Energy Flight Control (MEFC) algorithm. A systematic evaluation clarified the impacts of UAV flight speed on total energy consumption and total working time, of different node data loads on UAV energy consumption, of different flight speeds on UAV hovering collection time, and of the number of sensor nodes on total energy consumption. Simulation results show that the total energy consumption of the improved model was at least 6.3% less than that of the PSO-TSP and MEFC algorithms under different node numbers and UAV flight speeds when the data load of a single node was below 160 kB. Finally, the flight and hovering powers of a quadrotor UAV were tested, and the packet loss rate and received signal strength of the UAV were measured in field experiments. Actual field flights were carried out for both the DDRL-based method and the classical PSO-TSP data collection. Field experiment results show that the DDRL-based data collection reduced the total UAV energy consumption by 11.5% compared with the PSO-TSP. The DDDQN and DDRL approaches can provide energy-optimal references for UAV data collection from wireless sensor nodes in the field.
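The energy comparisons above rest on decomposing mission cost into flight energy along the route plus hover energy while draining each node's data. A hedged sketch of such a model (the power values, link rate, and waypoints below are invented for illustration, not measured in the study) could be:

```python
import math

def mission_energy(route, speed_mps, node_data_kB, rate_kBps,
                   p_flight_W, p_hover_W):
    """Total UAV energy in joules: flight power times flight time over the
    route, plus hover power times the time to drain each node's data at
    the given link rate."""
    dist = sum(math.dist(route[i], route[i + 1])
               for i in range(len(route) - 1))
    t_fly = dist / speed_mps                          # s
    t_hover = sum(kB / rate_kBps for kB in node_data_kB)  # s
    return p_flight_W * t_fly + p_hover_W * t_hover

# Illustrative numbers only (not from the paper): start point plus two
# hover points in metres, two nodes with 120 kB and 160 kB of data.
route = [(0, 0), (300, 400), (300, 1000)]
e = mission_energy(route, speed_mps=10,
                   node_data_kB=[120, 160], rate_kBps=50,
                   p_flight_W=200, p_hover_W=180)
```

A planner such as the DDRL method would then search over hovering positions and visiting order to minimize this quantity, which is why both the per-node data load and the flight speed appear in the reported comparisons.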

     
