Pellet feed quality prediction method based on Stacking multi-model fusion

吴俊华; 王粮局; 徐际童; 邹方磊; 王威; 郭绍永; 王红英

doi:10.11975/j.issn.1002-6819.202503226

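Summary: Pellet feed quality is affected by multiple factors, including feed formulation, process parameters, equipment parameters, and environmental parameters, which makes quality control difficult. To address this problem, this study proposes a pellet feed quality prediction method based on Stacking multi-model fusion. Using data collected from actual production lines, features were selected with the random forest algorithm and the maximal information coefficient, and a Stacking prediction model fusing multiple machine learning algorithms was constructed. The results show that the Stacking multi-model fusion algorithm outperforms single machine learning algorithms: the root mean square errors on the test set for the predicted pellet hardness, pellet durability index (PDI), and productivity were 2.932 N, 4.830, and 0.465 t/h, respectively, which are 8.26%, 5.48%, and 10.20% lower than those of the respective best single models. Further quantification of feature contributions with the random forest algorithm showed that pellet hardness and PDI are dominated by feed formulation factors, with cumulative contribution rates of 87.01% and 88.94%, respectively, whereas productivity is determined mainly by feeding frequency, with a contribution rate of 42.94%. This study provides a new technical method for the precise control of pellet feed quality and a theoretical basis for raising the level of intelligence and fine-grained control of feed production equipment.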
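The percentage reductions quoted above follow from the usual relative-change formula, (baseline - improved) / baseline. The short Python check below is only an illustrative sketch: it uses the unrounded RMSE values reported in the English abstract of this paper, and the function and variable names are hypothetical, not taken from the source.

```python
# RMSE on the test set as reported in the English abstract:
# (best single model, Stacking ensemble) for each quality indicator.
reported = {
    "pellet hardness (N)": (3.1962, 2.9322),
    "PDI": (5.110, 4.830),
    "productivity (t/h)": (0.5175, 0.4646),
}


def relative_reduction(baseline: float, improved: float) -> float:
    """Return the percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# Relative RMSE reduction of the Stacking model versus the best single model.
reductions = {
    name: relative_reduction(single, stacked)
    for name, (single, stacked) in reported.items()
}

for name, pct in reductions.items():
    print(f"{name}: RMSE lower by {pct:.2f}%")
```

The computed values agree with the reported 8.26%, 5.48%, and roughly 10.2% to within rounding.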

Abstract: Pellet feed is widely used in the agricultural industry, and its quality depends on multiple factors, including feed formulation, process parameters, equipment settings, and environmental conditions. In practice, a production line is often required to produce a variety of feed formulations or product types. However, pellet mill parameter settings have not kept pace with the demands of large-scale production in recent years, and adjustment by trial and error or operator experience has failed to achieve precise control over pellet quality. This study therefore proposes a pellet feed quality prediction method based on Stacking ensemble learning. Production data were collected from two production lines of a feed enterprise, and a dataset was constructed with 34 input features covering environmental parameters, feed formulation, equipment parameters, and process parameters. Pellet hardness, pellet durability index (PDI), and productivity were defined as the target outputs. Feature selection combined Random Forest with the Maximal Information Coefficient, so that non-redundant features with a significant influence on pellet feed quality were retained as model inputs; 11, 13, and 12 key features were ultimately selected for pellet hardness, PDI, and productivity, respectively. Seven machine learning algorithms were evaluated: Random Forest, Support Vector Regression, Extreme Gradient Boosting, Gradient Boosted Decision Trees, Gaussian Process Regression, Backpropagation Neural Network, and Least Absolute Shrinkage and Selection Operator (LASSO). The four best-performing models were selected as the base learners, Ridge Regression was used as the meta-learner, and a Stacking ensemble prediction model was constructed.
The results showed that the optimal single models for predicting pellet hardness, PDI, and productivity were Random Forest, Random Forest, and Gaussian Process Regression, respectively. Compared with these optimal single learners, the Stacking ensemble model performed better on all three quality indicators. On the test set, the root mean square error (RMSE) for pellet hardness, PDI, and productivity decreased from 3.1962 N, 5.110, and 0.5175 t/h to 2.9322 N, 4.830, and 0.4646 t/h, reductions of 8.26%, 5.48%, and 10.2%, respectively. Feature contributions were then quantified using Random Forest. Pellet hardness and PDI were predominantly influenced by feed formulation, with cumulative contribution rates of 87.01% and 88.94%, respectively. The top five contributors to pellet hardness were medium wheat bran, soybean oil, fish meal, standard soybean meal, and limestone; for PDI, they were wheat middlings, limestone, extruded soybean, soybean oil, and medium wheat bran. In contrast, productivity was governed primarily by operating parameters, with feeding frequency alone contributing 42.94%; its top five factors were feeding frequency, conditioning temperature, ambient temperature, steam pressure, and ambient humidity.
These findings provide theoretical support and reference data for the precise control of pellet feed quality. They can help improve the intelligent monitoring and precise control of pellet mill equipment, and they support scientific decision-making for quality optimization and cost control in the feed industry.
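The Stacking structure described in the abstract (several base learners whose out-of-fold predictions feed a Ridge Regression meta-learner) can be sketched with scikit-learn's `StackingRegressor`. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which four models were chosen as base learners, so the four used here (Random Forest, Gradient Boosted Decision Trees, Support Vector Regression, Gaussian Process Regression) are an illustrative guess; the data are synthetic; all hyperparameters are arbitrary; and scikit-learn is assumed to be installed.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (NOT the paper's dataset). 11 features mirrors the
# number of features the abstract reports as selected for pellet hardness.
X, y = make_regression(n_samples=300, n_features=11, n_informative=6,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Assumed base learners; the source only states that four were selected.
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("gbdt", GradientBoostingRegressor(random_state=0)),
    ("svr", make_pipeline(StandardScaler(), SVR(C=100.0))),
    ("gpr", make_pipeline(
        StandardScaler(),
        GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                                 normalize_y=True, random_state=0))),
]

# Ridge Regression as the meta-learner, as stated in the abstract. With cv=5
# the meta-learner is trained on out-of-fold base-learner predictions.
stack = StackingRegressor(estimators=base_learners,
                          final_estimator=Ridge(alpha=1.0), cv=5)
stack.fit(X_train, y_train)


def rmse(model) -> float:
    """Root mean square error of `model` on the held-out test split."""
    return float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))


stack_rmse = rmse(stack)
# Base learners refitted on the full training split by StackingRegressor.
base_rmse = {name: rmse(est) for name, est in stack.named_estimators_.items()}

print(f"stacking RMSE: {stack_rmse:.3f}")
for name, value in base_rmse.items():
    print(f"{name} RMSE: {value:.3f}")
```

On synthetic data the ensemble is not guaranteed to beat every base learner; the sketch only shows how the comparison between the Stacking model and single models could be set up.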
