Abstract:
OBJECTIVE Real-time monitoring of moisture content during the drying of peas is crucial for ensuring the quality of finished products and extending shelf life. Traditional detection methods, such as the oven-drying method, are destructive and time-consuming, and their results lag well behind the process, making them difficult to apply in modern online industrial monitoring. To address these issues, this study proposed a high-precision and lightweight non-destructive detection strategy for pea moisture content based on Visible and Near-Infrared (Vis-NIR) hyperspectral imaging. Compared with traditional single-point near-infrared spectroscopy, hyperspectral imaging avoids the background noise caused by irregular shrinkage during pea drying and provides a basis for analyzing the coupling of physical and chemical effects. METHOD Three hundred and sixty pea samples (variety "Zhongwan No. 6") were prepared and subjected to hot air drying at 60 °C. To construct a dataset with a representative moisture gradient, samples were collected at nine distinct time intervals covering the complete range from a fresh high-moisture state to a completely dried low-moisture state. Hyperspectral images of all samples were acquired in the spectral range of 372.66 to 1039.65 nm. A threshold segmentation method (reflectance threshold of 0.15 to 1.0 at 857.53 nm) was applied to automatically extract the region of interest (ROI), avoiding subjective manual selection. Following image acquisition, the raw spectral data were preprocessed using the Standard Normal Variate (SNV) method to correct for optical path differences. The dataset was partitioned into a calibration set and a prediction set at a ratio of 3:1 using the Kennard-Stone algorithm.
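The ROI thresholding, SNV, and Kennard-Stone steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the flat list-of-pixel-spectra input, and the use of the sample standard deviation in SNV are assumptions made for illustration, and only the Python standard library is used so the block stays self-contained (an array library would normally be preferred for full hyperspectral cubes).

```python
from math import dist
from statistics import mean, stdev

def roi_mean_spectrum(pixels, band_index, low=0.15, high=1.0):
    """Average the spectra of pixels whose reflectance at band_index is in [low, high].

    `pixels` is a list of per-pixel spectra (hypothetical input layout).
    """
    roi = [p for p in pixels if low <= p[band_index] <= high]
    if not roi:
        raise ValueError("no pixel falls inside the reflectance window")
    return [mean(band) for band in zip(*roi)]

def snv(spectrum):
    """Standard Normal Variate: centre one spectrum and scale it by its own std."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]

def kennard_stone(X, n_select):
    """Indices of n_select (>= 2) samples picked by the Kennard-Stone max-min rule."""
    n = len(X)
    d = [[dist(a, b) for b in X] for a in X]  # pairwise Euclidean distances
    # Seed the selection with the two most distant samples.
    i, j = max(((i, j) for i in range(n) for j in range(i + 1, n)),
               key=lambda ij: d[ij[0]][ij[1]])
    selected = [i, j]
    remaining = set(range(n)) - {i, j}
    while len(selected) < n_select and remaining:
        # Add the sample whose nearest already-selected neighbour is farthest away.
        pick = max(sorted(remaining),
                   key=lambda k: min(d[k][s] for s in selected))
        selected.append(pick)
        remaining.remove(pick)
    return selected
```

With 360 samples and a 3:1 ratio, `kennard_stone(X, 270)` would return the calibration indices, leaving the other 90 samples for prediction.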
Furthermore, four feature wavelength screening algorithms, namely Least Absolute Shrinkage and Selection Operator (LASSO), Bootstrapping Soft Shrinkage (BOSS), Particle Swarm Optimization (PSO), and Competitive Adaptive Reweighted Sampling (CARS), were compared. These feature selection methods were combined with Partial Least Squares Regression (PLSR), Least Squares Support Vector Machine (LSSVM), Categorical Boosting (CatBoost), and a lightweight One-Dimensional Convolutional Neural Network (1D-CNN) to construct prediction models. RESULTS Spectral response analysis revealed that drying induced a significant downward baseline drift across the entire band. According to the Kubelka-Munk theory, this phenomenon was attributed to the physical shrinkage and increased surface roughness of the peas, which markedly enhanced diffuse scattering. The CARS algorithm gave the best feature dimensionality reduction, selecting a subset of ten characteristic wavelengths that accounted for only 5.7% of the full spectrum. An ablation study on these wavelengths showed that CARS retained not only the high-correlation water absorption bands (e.g., around 970 nm) but also several low-correlation baseline reference points (e.g., 504.14 nm and 1035.63 nm). When the low-correlation baseline reference points were removed, the prediction coefficient of determination ($R_p^2$) of the model dropped sharply to 0.7155. This provided quantitative evidence that these variables compensated for the physical baseline drift caused by drying shrinkage through an implicit mathematical difference mechanism. Among all model combinations, the non-linear CARS-LSSVM model achieved the best overall performance, with an $R_p^2$ of 0.9648 and a root mean square error of prediction (RMSEP) of 0.0477. This accuracy was superior to that of the PSO-LSSVM model, which retained 77 features. Furthermore, the CARS-LSSVM model showed a clear advantage in computational efficiency: its running time was 0.042 seconds on the same hardware platform, hundreds of times faster than the 1D-CNN deep learning model. CONCLUSION The proposed strategy extracts chemical information by decoupling the physical and chemical effects that are coupled during the drying process. Combining the CARS algorithm, which mines a minimal feature subset, with the LSSVM model yielded a robust solution for pea moisture content detection. This study provides data support and theoretical guidance for the future development of low-cost, low-power, handheld multispectral intelligent sensors.
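For readers unfamiliar with LSSVM regression and the two reported metrics, the following is a minimal sketch under stated assumptions. The abstract does not give the kernel or hyperparameters, so the RBF kernel and the `gamma` and `sigma2` defaults are illustrative choices, and all function names are hypothetical. The model is obtained by solving the standard LS-SVM linear system, and the prediction metrics are computed as $R_p^2 = 1 - \sum_i (y_i - \hat{y}_i)^2 / \sum_i (y_i - \bar{y})^2$ and $\mathrm{RMSEP} = \sqrt{\tfrac{1}{n}\sum_i (y_i - \hat{y}_i)^2}$.

```python
from math import exp, sqrt

def rbf(a, b, sigma2):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel)."""
    return exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2.0 * sigma2))

def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in reversed(range(n)):  # back substitution
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def lssvm_fit(X, y, gamma=100.0, sigma2=1.0):
    """Fit LS-SVM regression and return (bias, alphas).

    Solves [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
    gamma and sigma2 are illustrative defaults, not values from the study.
    """
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(X[i], X[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # ridge term on the kernel diagonal
        A.append(row)
    z = solve(A, [0.0] + list(y))
    return z[0], z[1:]

def lssvm_predict(X_train, bias, alphas, X_new, sigma2=1.0):
    """Predict with f(x) = sum_i alpha_i * k(x, x_i) + b."""
    return [bias + sum(a * rbf(x, xt, sigma2) for a, xt in zip(alphas, X_train))
            for x in X_new]

def r2_and_rmsep(y_true, y_pred):
    """Coefficient of determination and root mean square error of prediction."""
    n = len(y_true)
    y_bar = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, sqrt(ss_res / n)
```

In this sketch each row of `X` would hold the reflectance values at the selected wavelengths for one sample; because training reduces to a single linear solve, the small feature subset keeps both fitting and prediction cheap.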