Abstract:
The neon flying squid (Ommastrephes bartramii) is an economically important cephalopod in the Northwest Pacific Ocean whose spatial distribution and abundance are jointly regulated by multiple marine environmental factors. Traditional modeling approaches, however, face two main limitations: fishery catch data are sparse and zero-inflated, which distorts the input representation, and the models struggle to capture the complex nonlinear relationships between environmental conditions and resource dynamics, which limits both prediction accuracy and mechanistic interpretation. To overcome these limitations, this study developed a novel deep learning framework that integrates multi-source environmental variables to predict and interpret squid resources within a single workflow. We combined multi-year (2015–2019) satellite-derived environmental data with fishery catch records. The proposed framework consists of three core modules. An enhanced TimeXer-based prediction module incorporates a cross-attention mechanism to capture the interdependencies between historical catch per unit effort (CPUE) sequences and multi-source environmental variables, including sea surface temperature (SST), chlorophyll-a concentration (Chl-a), salinity, dissolved oxygen (DO), pH, and sea surface height (SSH), with depth-resolved variables sampled at 0 m, 100 m, 200 m, and 300 m. An adaptive gating mechanism dynamically fuses these heterogeneous information streams to improve CPUE prediction accuracy. A hierarchical classification module based on LightGBM was implemented to mitigate data imbalance.
This module performs a two-stage classification: it first distinguishes fishing from non-fishing areas, then categorizes resource abundance (Few, Little, Mid, Most) within the identified fishing zones. An interpretability module based on SHAP (SHapley Additive exPlanations) values quantifies the marginal contributions and interaction effects of each environmental factor, enabling a transparent and systematic analysis of the key drivers of squid distribution. On an independent test set, the framework achieved an R² of 0.9722 and a mean squared error (MSE) of 0.0308, outperforming several state-of-the-art deep learning models (e.g., Transformer, Crossformer, FEDformer) on MSE, RMSE, MAE, and MRE. SHAP analysis identified chlorophyll-a concentration (optimal range: 0.1–0.5 mg/m³), sea surface salinity (33.0–33.5 PSU), sea surface temperature (15.0–22.0 °C), dissolved oxygen, and sea surface height as the primary environmental drivers of high-catch areas. SHAP interaction analysis further revealed synergistic effects, particularly between vertical salinity stratification and latitude, between chlorophyll-a and salinity, and between salinity and dissolved oxygen, pointing to a coupled "dynamic–nutrient–physicochemical" mechanism underlying squid aggregation. This study provides an interpretable, high-precision approach to forecasting squid resource dynamics and identifying how multiple environmental factors jointly drive them. The framework offers a tool for sustainable fishery management and insight into the ecology of O. bartramii under rapid environmental change, supporting scientific decision-making for the conservation and use of oceanic fishery resources.
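
For readers who want to reproduce the kind of evaluation reported above, the following is a minimal sketch of the five metrics named in the abstract (MSE, RMSE, MAE, MRE, R²), written against their standard textbook definitions. It is not the authors' evaluation code. The function name `regression_metrics`, the `eps` threshold, and the choice to compute MRE only over non-zero observations (zero CPUE values would otherwise cause division by zero) are assumptions made for illustration, and the sample values are invented purely to exercise the function.

```python
import math


def regression_metrics(y_true, y_pred, eps=1e-8):
    """Return MSE, RMSE, MAE, MRE and R^2 for two equal-length sequences.

    Assumption: MRE is the mean of |error| / |observed| taken only over
    observations whose magnitude exceeds `eps`, so zero-catch records do
    not cause a division by zero. The paper may define MRE differently.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n

    # Relative error is undefined where the observed value is zero.
    relative = [abs(e) / abs(t) for e, t in zip(errors, y_true) if abs(t) > eps]
    mre = sum(relative) / len(relative) if relative else float("nan")

    # R^2 = 1 - SS_res / SS_tot, undefined when the observations are constant.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "MRE": mre, "R2": r2}


if __name__ == "__main__":
    # Invented toy values, not data from the study.
    observed = [0.0, 1.0, 2.0, 4.0]
    predicted = [0.5, 1.0, 1.5, 4.0]
    for name, value in regression_metrics(observed, predicted).items():
        print(f"{name}: {value:.4f}")
```

Because catch data are zero-inflated, how zero observations are handled in MRE can change the reported value noticeably, so any comparison with published figures should confirm the definition used.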