Abstract:
Pellet feed is widely used in the agricultural industry. Its quality depends on multiple factors, including the feed formulation, process parameters, equipment settings, and environmental conditions. In practice, a single production line is often required to produce a variety of feed formulations or product types. However, parameter setting for the pellet mill has not kept pace with the demands of large-scale production in recent years, and trial-and-error or manual experience cannot achieve precise control over pellet quality. In this study, a quality prediction method for pellet feed products was proposed using Stacking ensemble learning. Actual production data were collected from two production lines of a feed enterprise. A dataset was then constructed with 34 input features covering environmental parameters, feed formulation, equipment parameters, and process parameters. Feature selection was performed using Random Forest and the Maximal Information Coefficient (MIC), so that non-redundant features with a significant influence on pellet feed quality were identified as the model inputs. Pellet hardness, pellet durability index (PDI), and productivity were defined as the target outputs. Seven machine learning algorithms were evaluated: Random Forest, Support Vector Regression, Extreme Gradient Boosting, Gradient Boosted Decision Trees, Gaussian Process Regression, Backpropagation Neural Network, and Least Absolute Shrinkage and Selection Operator (LASSO). After evaluation, the four best-performing models were selected as the base learners, and Ridge Regression was used as the meta-learner to construct the stacking ensemble prediction model. The results showed that the optimal single models for predicting pellet hardness, PDI, and productivity were Random Forest, Random Forest, and Gaussian Process Regression, respectively.
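As an illustration of the stacking design described above, the following is a minimal sketch using scikit-learn, not the implementation used in this study. It assumes synthetic data in place of the production dataset, and the four base learners shown (Random Forest, Gradient Boosted Decision Trees, Support Vector Regression, Gaussian Process Regression) are a hypothetical choice, since which four of the seven candidates were selected is not stated here. All hyperparameters are placeholders.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for one quality target (e.g. pellet hardness);
# the real production data are not reproduced here.
X, y = make_regression(
    n_samples=300, n_features=12, n_informative=6, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical set of four base learners (an assumption for illustration).
base_learners = [
    ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ("gbdt", GradientBoostingRegressor(random_state=0)),
    ("svr", make_pipeline(StandardScaler(), SVR(C=10.0))),
    (
        "gpr",
        make_pipeline(
            StandardScaler(),
            GaussianProcessRegressor(
                kernel=RBF() + WhiteKernel(), normalize_y=True, random_state=0
            ),
        ),
    ),
]

# Ridge Regression as the meta-learner; it is trained on out-of-fold
# predictions of the base learners obtained with 5-fold cross-validation.
stack = StackingRegressor(
    estimators=base_learners, final_estimator=Ridge(alpha=1.0), cv=5
)
stack.fit(X_train, y_train)


def rmse(model):
    """Root mean square error of a fitted model on the held-out test set."""
    return float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))


stack_rmse = rmse(stack)
# Each base learner is also refit on the full training set by StackingRegressor.
base_rmse = {name: rmse(est) for name, est in stack.named_estimators_.items()}

print(f"stacking RMSE: {stack_rmse:.3f}")
for name, value in base_rmse.items():
    print(f"{name} RMSE: {value:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on in-sample predictions, keeps the Ridge weights from simply favoring whichever base learner overfits the training data most.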
Compared with the optimal single learners, the stacking ensemble model performed better on all three quality indicators. On the test set, the root mean square error (RMSE) for pellet hardness, PDI, and productivity decreased from 3.1962 N, 5.110, and 0.5175 t/h to 2.9322 N, 4.830, and 0.4646 t/h, respectively, corresponding to reductions in prediction error of 8.26%, 5.48%, and 10.2%. A quantitative analysis of the feature contributions was made using Random Forest. The results revealed that pellet hardness and PDI were predominantly influenced by the feed formulation, with cumulative contribution rates of 87.01% and 88.94%, respectively. The top five contributing factors for pellet hardness were medium wheat bran, soybean oil, fish meal, standard soybean meal, and limestone. For PDI, the five most influential variables were wheat middlings, limestone, extruded soybean, soybean oil, and medium wheat bran. In contrast, productivity was primarily governed by operating and environmental parameters: the feeding frequency alone contributed 42.94%, and the top five factors were feeding frequency, conditioning temperature, ambient temperature, steam pressure, and ambient humidity. With the combined feature selection using Random Forest and the Maximal Information Coefficient, a total of 11, 13, and 12 key features were ultimately selected for pellet hardness, PDI, and productivity, respectively. In conclusion, the findings can provide theoretical support and data references for the precise control of pellet feed quality, and for enhancing the intelligent monitoring and precise control of feed production equipment in pellet mills.
The proposed model can also support data-driven decision-making for quality optimization and cost control in the feed industry.
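The relative error reductions reported above can be re-derived from the quoted RMSE values. The short check below uses only the numbers given in this abstract; the dictionary keys and the helper name `relative_reduction` are illustrative.

```python
# Test-set RMSE values as quoted in the abstract.
rmse_single = {"hardness_N": 3.1962, "pdi": 5.110, "productivity_t_per_h": 0.5175}
rmse_stack = {"hardness_N": 2.9322, "pdi": 4.830, "productivity_t_per_h": 0.4646}


def relative_reduction(before, after):
    """Percentage drop in error when going from `before` to `after`."""
    return 100.0 * (before - after) / before


reductions = {
    name: relative_reduction(rmse_single[name], rmse_stack[name])
    for name in rmse_single
}

for name, pct in reductions.items():
    print(f"{name}: {pct:.2f}% lower RMSE")
```

The computed values round to 8.26%, 5.48%, and 10.2%, matching the reductions stated for pellet hardness, PDI, and productivity.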