Abstract:
Accurate county-level maize yield prediction is essential for regional crop monitoring and agricultural management. However, existing sequential models often fail to represent continuous local temporal changes and whole-season global dependencies simultaneously. This study aimed to develop a deep learning model that integrates multi-source agricultural information and improves maize yield prediction accuracy, early prediction ability, and spatial generalization at the county scale. A Dual-Branch Global and Local Feature Fusion Network (DGLF-Net) was developed for maize yield prediction in Jilin Province from 2009 to 2022. The model integrated Moderate Resolution Imaging Spectroradiometer (MODIS) surface reflectance, vegetation indices, meteorological data, and soil properties. A Local Temporal Feature Extractor (LTFE), composed of a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network, was used to extract local continuous temporal features. A Global Temporal Attention Module (GTAM) was used to capture long-range temporal dependencies across the growing season. A gated fusion module (Gating) was then used to combine local temporal information and global contextual information. Data from 2009 to 2020 were used for training, and data from 2021 to 2022 were used for testing. Correlation analysis showed that maize yield formation was affected by both whole-season cumulative effects and key-month local dynamics. The enhanced vegetation index (EVI) in July showed the strongest positive correlation with yield, with a correlation coefficient of 0.51, indicating that remote sensing information during key growth stages was important for yield prediction. Ablation experiments confirmed the contribution of the main components of DGLF-Net. In 2022, the complete model achieved a coefficient of determination (R²) of 0.8120, a root mean squared error (RMSE) of 441.95 kg/hm², and a mean absolute error (MAE) of 325.30 kg/hm². After removing GTAM and Gating, R² decreased to 0.7117 and 0.7739, respectively. Within LTFE, removing CNN reduced R² to 0.6053, while removing LSTM reduced R² to 0.7452, demonstrating that local feature extraction and temporal dependency modeling both contributed to prediction performance. Model comparison experiments showed that DGLF-Net outperformed random forest regression (RFR), extreme gradient boosting (XGBoost), CNN, Transformer, LSTM, CNN-Transformer, and CNN-LSTM-Attention in both 2021 and 2022. Compared with CNN-LSTM-Attention in 2022, DGLF-Net improved R² by 5.52% and reduced RMSE and MAE by 9.69% and 17.38%, respectively. Spatially independent validation showed that DGLF-Net maintained stable performance in counties with different yield levels. In 2022, the prediction accuracies for Ji’an, Linjiang, and Gongzhuling were 96.10%, 97.47%, and 95.08%, respectively. Early prediction analysis showed that the model reached an R² of about 0.77 using data from May to September, indicating that reliable prediction could be achieved one month before harvest. The vegetation index combination analysis showed that the combination of NDVI, CVI, and green normalized difference vegetation index (GNDVI) performed best in 2022, with an R² of 0.8260, an RMSE of 425.30 kg/hm², and an MAE of 309.40 kg/hm². SHapley Additive exPlanations (SHAP) analysis indicated that soil organic carbon (SoC), bulk density (BulkDensity), and vapor pressure deficit (Vpd) were consistently important features in both years. Spatial prediction maps further showed that DGLF-Net better preserved the county-level yield gradient in Jilin Province and produced a more balanced spatial error distribution. The proposed DGLF-Net effectively improved county-level maize yield prediction by jointly learning local temporal dynamics and global seasonal dependencies. It showed strong prediction accuracy, early prediction potential, and spatial generalization. Future studies could further incorporate higher-resolution remote sensing data, planting density, field management information, cultivar differences, and pest or disease information to improve model applicability in more complex agricultural scenarios.
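
For readers who wish to reproduce the evaluation metrics reported above, the following is a minimal sketch of how R², RMSE, and MAE are conventionally computed from observed and predicted county yields. The function names are illustrative and are not taken from the study. The helper relative_change_percent assumes that the reported percentage improvements over a baseline model are relative changes with respect to the baseline value; the abstract does not state the exact convention.

```python
import math


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the same unit as the yields (e.g. kg/hm^2)."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mae(observed, predicted):
    """Mean absolute error, in the same unit as the yields."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def relative_change_percent(baseline, candidate):
    """Relative change of a metric versus a baseline, in percent.

    Assumption: improvements are expressed relative to the baseline value.
    A negative result for RMSE or MAE means the error was reduced.
    """
    return 100.0 * (candidate - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical county yields in kg/hm^2, for illustration only.
    observed = [8000.0, 9000.0, 10000.0, 11000.0]
    predicted = [8200.0, 8900.0, 10100.0, 10800.0]
    print(r2_score(observed, predicted))
    print(rmse(observed, predicted))
    print(mae(observed, predicted))
```

The yields in the usage block are made-up values chosen only to exercise the functions; they are not data from the study.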