Abstract:
Rapid monitoring of dissolved organic carbon (DOC) content in frozen black soil is crucial for assessing soil fertility under different conditions, understanding nutrient migration and transformation, and elucidating carbon cycling processes. Hyperspectral remote sensing is widely applied in soil organic matter monitoring because it is rapid and efficient. However, soil freezing alters spectral characteristics, and the accuracy of inversion models under frozen conditions remains unclear. This study established a hyperspectral inversion model for DOC content in frozen black soil and systematically compared model accuracy under frozen and unfrozen states across various moisture gradients. Typical seasonally frozen black soil from Heshan Farm in Heilongjiang Province was selected, and samples at three DOC levels were collected at random. After the DOC content was measured, pure water was added in graded amounts, yielding 120 experimental samples with moisture contents ranging from 3.57% to 30.14% and DOC concentrations between 171.80 mg/kg and 322.75 mg/kg. Surface hyperspectral reflectance under unfrozen and frozen states was collected using an AvaField-3 field spectroradiometer. Five spectral treatments were compared: raw spectral reflectance (REF), first-order differential reflectance (FDR), second-order differential reflectance (SDR), logarithm of reciprocal (LR) and standard normal variate (SNV). Variable importance in projection (VIP) was used to select sensitive bands, and a one-dimensional convolutional neural network (1D-CNN) was employed for feature extraction. The dataset was partitioned using the Kennard-Stone (KS) algorithm with a validation ratio of 0.33, resulting in 80 samples for model calibration and 40 for validation.
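The spectral transformations and the KS partition described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names, the use of `numpy.gradient` for the derivatives, the base-10 logarithm for LR, and the choice to run KS on Euclidean distances between transformed spectra are all assumptions, and the spectra are synthetic placeholders rather than measured data.

```python
import numpy as np

def first_derivative(spectra, wavelengths):
    """FDR: derivative of reflectance with respect to wavelength, per sample (row)."""
    return np.gradient(spectra, wavelengths, axis=1)

def second_derivative(spectra, wavelengths):
    """SDR: derivative of the first derivative."""
    return np.gradient(first_derivative(spectra, wavelengths), wavelengths, axis=1)

def log_reciprocal(spectra, eps=1e-12):
    """LR: log10(1/R); reflectance is clipped at eps to avoid division by zero."""
    return np.log10(1.0 / np.clip(spectra, eps, None))

def snv(spectra):
    """SNV: centre and scale each spectrum by its own mean and standard deviation."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def kennard_stone(X, n_select):
    """Return indices of n_select rows of X chosen by the Kennard-Stone rule."""
    # Pairwise Euclidean distances via the Gram matrix (keeps memory modest).
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    dist = np.sqrt(np.maximum(d2, 0.0))
    # Start from the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_select:
        # For each remaining sample, distance to its nearest selected sample;
        # pick the sample for which that distance is largest.
        min_d = dist[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_d))]
        selected.append(pick)
        remaining.remove(pick)
    return np.array(selected)

# Synthetic placeholder data: 120 samples x 50 bands (hypothetical band range).
rng = np.random.default_rng(0)
wavelengths = np.linspace(400.0, 1000.0, 50)
reflectance = rng.uniform(0.05, 0.6, size=(120, 50))

fdr = first_derivative(reflectance, wavelengths)
cal_idx = kennard_stone(fdr, 80)                    # 80 calibration samples
val_idx = np.setdiff1d(np.arange(120), cal_idx)     # remaining 40 for validation
```

With 120 samples, selecting 80 by KS leaves 40 for validation, which matches the stated validation ratio of 0.33.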
Four machine learning models, namely backpropagation neural network (BPNN), random forest (RF), tabular prior-fitted network (TabPFN) and extreme gradient boosting (XGBoost), were used to construct DOC inversion models. Separate models were further built for the low and high moisture gradients to select the optimal model. Model performance was evaluated using the coefficient of determination for calibration ($R_c^2$), the coefficient of determination for prediction ($R_p^2$), the root mean square error (RMSE) and the ratio of performance to deviation (RPD). The results indicated that: 1) All four models effectively predicted soil DOC content. The accuracy ranking was the same under unfrozen and frozen states: RF > XGBoost > TabPFN > BPNN. The RF-FDR model performed best (frozen: $R_c^2$ = 0.867, $R_p^2$ = 0.851, RMSE = 13.095 mg/kg, RPD = 2.410; unfrozen: $R_c^2$ = 0.824, $R_p^2$ = 0.808, RMSE = 14.830 mg/kg, RPD = 2.123), while the BPNN-LR model performed worst (frozen: $R_c^2$ = 0.778, $R_p^2$ = 0.742, RMSE = 17.200 mg/kg, RPD = 1.889; unfrozen: $R_c^2$ = 0.721, $R_p^2$ = 0.687, RMSE = 18.945 mg/kg, RPD = 1.670). 2) Changes in soil freezing status significantly affected the accuracy of DOC inversion across models. Comparative analysis revealed that the optimal preprocessing methods remained consistent before and after freezing. However, freezing reduced the inversion accuracy of all models, with the largest decline observed in BPNN ($R_p^2$ and RPD decreased by 7.36% and 11.62%, respectively) and the smallest in XGBoost ($R_p^2$ and RPD decreased by 4.70% and 7.25%, respectively). 3) Moisture content gradients considerably influenced model accuracy. Under unfrozen conditions, high-moisture models outperformed low-moisture models, with $R_p^2$ and RPD higher by 0.06 to 0.09 and 0.20 to 0.40, respectively. Under frozen conditions, the accuracy of both low- and high-moisture models decreased, with a more pronounced reduction in low-moisture models ($R_p^2$ and RPD decreased by 16% to 18% and 13% to 19%, respectively). The RF-FDR model developed in this study provides technical support for hyperspectral monitoring of DOC content in seasonally frozen black soil.
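For readers unfamiliar with the evaluation metrics named above, the following sketch shows one common way to compute them. The definitions are assumptions, since the abstract does not state them: the coefficient of determination is taken as 1 - SS_res/SS_tot, and RPD as the sample standard deviation (n - 1 denominator) of the observed values divided by the RMSE. The DOC values in the example are hypothetical, not data from the study.

```python
import math

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target (mg/kg for DOC)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def rpd(y_true, y_pred):
    """Ratio of performance to deviation: SD of observed values / RMSE.

    Uses the sample standard deviation (n - 1); this choice is an assumption.
    """
    n = len(y_true)
    mean = sum(y_true) / n
    sd = math.sqrt(sum((t - mean) ** 2 for t in y_true) / (n - 1))
    return sd / rmse(y_true, y_pred)

# Hypothetical observed and predicted DOC values (mg/kg), for illustration only.
observed = [200.0, 250.0, 300.0]
predicted = [210.0, 240.0, 300.0]

print(f"R2   = {r_squared(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f} mg/kg")
print(f"RPD  = {rpd(observed, predicted):.3f}")
```

Applying the same functions to the calibration set gives the calibration coefficient of determination, and applying them to the validation set gives the prediction coefficient, RMSE and RPD.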