Abstract:
To address the high complexity, cumbersome calculation procedures, and insufficient accuracy common in vessel gross tonnage (GT) calculation, this study proposes a modeling method that combines multiple vessel characteristic parameters with multi-algorithm optimization. First, candidate feature variables were derived from the principal ship dimensions, and the Pearson correlation coefficient between each feature variable and the gross tonnage was computed to assess the strength of their relationship. Based on these feature variables, three nonlinear regression models for gross tonnage were constructed: the linear multiplicative model, the sub-item exponential model, and the hybrid model. These models were used to analyze the nonlinear relationship between the feature variables and the gross tonnage. Subsequently, a dataset of 1,913 fishing vessels from the South China Sea region, comprising 448 trawlers, 479 purse seiners, 440 gillnetters, 237 setnetters, and 309 longliners, was divided into training and testing sets at a ratio of 8:2. Regression prediction of gross tonnage was performed using a backpropagation neural network (BPNN) and a random forest (RF). Model fitting and validation were carried out using nonlinear least squares and particle swarm optimization (PSO). To verify robustness, each model was tested under noise intensities of 3%, 5%, 10%, and 20%. Finally, the models were validated experimentally on the data of the 1,913 fishing vessels.
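As a rough illustration of the preprocessing steps described above, the following Python sketch computes a Pearson correlation coefficient, performs an 8:2 split, and perturbs values at a given noise intensity. The function names (`pearson`, `split_8_2`, `add_noise`), the shuffling seed, and the uniform multiplicative noise model are assumptions made for illustration; the abstract does not specify how the split was randomized or how the noise was injected.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def split_8_2(rows, seed=0):
    """Shuffle a copy of the rows and split it 8:2 (the seed is an assumption)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def add_noise(values, intensity, seed=0):
    """Scale each value by a random factor in [1 - intensity, 1 + intensity].

    Uniform multiplicative noise is an assumed noise model, not the paper's.
    """
    rng = random.Random(seed)
    return [v * (1 + rng.uniform(-intensity, intensity)) for v in values]


if __name__ == "__main__":
    # Synthetic stand-in data, not the fishing vessel dataset.
    lengths = [12.0, 18.5, 24.0, 30.2, 36.8]
    tonnages = [15.0, 48.0, 95.0, 170.0, 260.0]
    print("r(L, GT) =", round(pearson(lengths, tonnages), 4))

    train, test = split_8_2(range(1913))
    print("train/test sizes:", len(train), len(test))

    for intensity in (0.03, 0.05, 0.10, 0.20):
        noisy = add_noise(lengths, intensity)
        print(intensity, [round(v, 2) for v in noisy])
```

With 1,913 records, this particular split rule yields 1,530 training and 383 testing samples; a different rounding convention would shift one record between the two sets.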
The results showed that the Pearson correlation coefficients between the feature variables (length L, breadth B, and depth D) and the gross tonnage GT were 0.9377, 0.8204, and 0.9327, respectively. All three values are close to 1, indicating a strong correlation with gross tonnage, so L, B, and D were selected as the feature variables. In the regression prediction of fishing vessel gross tonnage, the random forest outperformed the backpropagation neural network on all reported error metrics, including mean absolute error (MAE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and the coefficient of determination (R²), indicating higher prediction accuracy. Nonlinear least squares outperformed particle swarm optimization (PSO) on mean bias error (MBE), MAE, and MAPE, as well as in computational efficiency. The hybrid gross tonnage prediction model outperformed both the linear multiplicative model and the sub-item exponential model on MAE, RMSE, MAPE, and R², and it showed significantly better interference resistance than those two models at every noise intensity tested. The hybrid model achieved an MBE of -1.2444, an RMSE of 32.0362, an MAE of 24.481, a MAPE of 9.94%, and an R² of 0.9619, indicating good generalization ability and robustness. Building on this modeling method, which integrates feature engineering with multi-algorithm optimization, the study further compared the actual gross tonnage of the vessels, the values given by the international gross tonnage measurement formula, and the model predictions. The hybrid model achieved higher accuracy than the international GT measurement formula, which validates its effectiveness. By applying artificial intelligence and machine learning methods, this study optimized the calculation model for the gross tonnage of fishing vessels and significantly improved calculation accuracy. The results provide a reference for the digital and intelligent design and management of fishing vessels and support technological progress and management innovation in the fisheries sector.
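For readers who want to reproduce the evaluation metrics named above, the following Python sketch computes MBE, MAE, RMSE, MAPE, and R² from paired actual and predicted values. The function name `regression_metrics` and the sample numbers are hypothetical, and the MBE sign convention (predicted minus actual) is an assumption, since the abstract does not state which convention was used.

```python
import math


def regression_metrics(actual, predicted):
    """Return MBE, MAE, RMSE, MAPE (in percent), and R^2 for paired values."""
    n = len(actual)
    # Assumed convention: error = predicted - actual, so a negative MBE
    # means the model underestimates on average.
    errors = [p - a for a, p in zip(actual, predicted)]

    mbe = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n

    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot

    return {"MBE": mbe, "MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    actual_gt = [100.0, 200.0, 300.0, 400.0]
    predicted_gt = [110.0, 190.0, 310.0, 390.0]
    for name, value in regression_metrics(actual_gt, predicted_gt).items():
        print(f"{name}: {value:.4f}")
```

MAPE is undefined when any actual value is zero, which is not a concern for gross tonnage but matters if the function is reused elsewhere.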