Horizontal attitude control of a vibrating screen surface based on deep reinforcement learning

李鑫宇; 张亚男; 金明志; 薛臻; 谭俊; 赵湛

doi:10.11975/j.issn.1002-6819.202510033

Horizontal attitude control method for vibrating screen surface using deep reinforcement learning

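Abstract: To improve vibrating screening performance when material is fed non-uniformly, this study proposes an adaptive control method for the horizontal attitude of a vibrating screen surface based on the Double Deep Q-Network (DDQN) deep reinforcement learning framework. Taking the vibrating screening of rice grains as the research object, a multi-degree-of-freedom vibrating screening dynamics model was built with the discrete element method, and the screening, transport, and screen-penetration processes of the material on the vibrating screen surface were obtained by simulation. An evaluation index for the uniformity of material distribution on the screen surface was proposed; the correlation between distribution uniformity and screening performance was analysed; the effects of the feeding distribution state and the horizontal attitude angle of the screen surface on screening performance were examined; and a reasonable range for the distribution uniformity coefficient on the screen surface was determined. The material penetration rate in selected regions beneath the vibrating screen surface was chosen as the measured quantity, and a BP neural network was built to predict the distribution uniformity of material on the vibrating screen surface; the trained network showed good convergence and generalization. Within the DDQN framework, the action space and state space of the vibrating screening operation were defined, a new reward function consisting of dense and sparse terms was constructed, the main hyperparameters were selected, and a DDQN-based horizontal attitude control model for the vibrating screen surface was designed. An interactive environment was built from the discrete element simulation dataset, and the DDQN model was trained online, showing good convergence. The trained DDQN model was applied to a multi-degree-of-freedom hybrid vibrating screening control system, and comparative rice grain screening tests were carried out. The results show that under uniform feeding the ideal horizontal attitude angle of the screen surface is 0°, and the loss rate of the proposed control method does not differ significantly from that of the conventional fixed horizontal attitude angle. As the material feeding uniformity coefficient increased from 0 to 0.375, the loss rate with a fixed horizontal attitude angle rose rapidly from 0.59% to 0.98%. The proposed control method adaptively adjusts the horizontal attitude angle of the vibrating screen surface, which changes the direction of material motion on the screen surface and improves the distribution uniformity; the loss rate rose only to about 0.68%, about 30.6% lower by comparison, which verifies the feasibility and advantages of deep reinforcement learning in vibrating screen attitude control systems.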

Abstract: To improve the performance of vibrating screening operations under dynamic feeding conditions, a novel horizontal attitude control method for the vibrating screen surface is proposed using the Double Deep Q-Network (DDQN) deep reinforcement learning framework. Taking the vibrating screening of rice grains as the research object, a multi-degree-of-freedom vibrating screening dynamics model was established using the discrete element method (DEM), and the processes of material transport on the screen surface and passage through the screen apertures were obtained through DEM simulations. An evaluation index for the uniformity of material distribution on the screen surface was proposed, and the correlation between the material distribution state on the screen surface and the screening performance was obtained. By analysing the influence of the material feeding distribution state and the horizontal attitude angle of the screen surface on the screening performance, a reasonable range for the uniformity coefficient of the material distribution on the screen surface was determined. Selected regions below the vibrating screen surface were chosen for monitoring the rate at which material passes through the screen apertures, and a BP neural network was established to predict, in real time, the uniformity of material distribution on the vibrating screen surface. After training, the neural network demonstrated good convergence and generalization performance. Under the DDQN framework, the action space and state space of the vibrating screening operation were determined, and a new reward function composed of dense and sparse terms was constructed. The main hyperparameters of the DDQN model were then optimized, and a horizontal attitude control model of the vibrating screen surface was constructed. The DDQN model was trained using the DEM simulation dataset and demonstrated good convergence.
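The Double DQN update and a reward built from a dense term plus a sparse term can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the discount factor `gamma`, and the reward parameters `threshold` and `bonus` are hypothetical, and the reward shape (penalise the uniformity coefficient at every step, add a bonus once it falls within an acceptable range) is only an assumption about what "dense and sparse terms" might look like.

```python
def ddqn_target(reward, next_q_online, next_q_target, done, gamma=0.9):
    """Double DQN target for a single transition.

    next_q_online / next_q_target hold the next-state Q-values for every
    discrete action, from the online and target networks respectively.
    """
    if done:
        return reward
    # The online network selects the greedy action ...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ... and the target network evaluates that action.
    return reward + gamma * next_q_target[best_action]


def shaped_reward(uniformity_coeff, threshold=0.1, bonus=1.0):
    """Hypothetical dense + sparse reward (assumed form, not from the paper).

    A uniformity coefficient of 0 is taken to mean a perfectly uniform
    distribution, so smaller values are better.
    """
    dense = -uniformity_coeff  # dense term: feedback at every step
    sparse = bonus if uniformity_coeff <= threshold else 0.0  # sparse term
    return dense + sparse


if __name__ == "__main__":
    # Online net prefers action 1; the target net's value for action 1 is used.
    print(ddqn_target(1.0, [0.2, 0.8, 0.5], [1.0, 2.0, 3.0], False, gamma=0.5))
    print(shaped_reward(0.375))
```

Decoupling action selection (online network) from action evaluation (target network) is what distinguishes Double DQN from plain DQN, which takes the maximum of the target network's own values and tends to overestimate them.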
The trained DDQN model was applied to a multi-degree-of-freedom hybrid vibrating screening control system, and validation tests were carried out on a multi-degree-of-freedom vibrating screening test rig under different material feeding conditions. During the validation tests, the horizontal attitude angle of the screen surface was limited to a certain range. When the material was fed uniformly onto the vibrating screen surface, the ideal horizontal attitude angle of the screen surface was 0°; therefore, there was no significant difference in loss rate between the proposed control method and the traditional method with a fixed horizontal attitude angle. When the material feeding uniformity coefficient increased from 0 to 0.375, the loss rate under the fixed horizontal attitude angle increased rapidly from 0.59% to 0.98%. The proposed control method adaptively adjusted the horizontal attitude angle of the vibrating screen surface, thereby changing the direction of material movement on the screen surface. This effectively improved the uniformity of material distribution, and the loss rate increased only to approximately 0.68%, which is approximately 30.6% lower than with the fixed angle. The comparative tests verified the feasibility and advantages of deep reinforcement learning in the design of attitude control systems for vibrating screens. The proposed method provides an effective solution for improving material screening performance under dynamic feeding conditions and has broad application prospects in the intelligent control of agricultural screening equipment.
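The reported reduction of about 30.6% can be checked directly from the two loss rates quoted above for a feeding uniformity coefficient of 0.375. The helper name `relative_reduction` is introduced here only for illustration.

```python
def relative_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


fixed_angle_loss = 0.98  # loss rate (%) with a fixed horizontal attitude angle
controlled_loss = 0.68   # loss rate (%) with the proposed control method

print(f"{relative_reduction(fixed_angle_loss, controlled_loss):.1f}%")  # prints 30.6%
```

The figure is therefore a relative reduction: the absolute difference between the two loss rates is 0.30 percentage points.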
