Abstract:
Soil organic matter (SOM) plays a vital role in enhancing soil structure, increasing fertility, supporting carbon sequestration, and facilitating nutrient cycling. Conventional chemical analysis methods, while accurate, are slow, operationally complex, costly, and destructive to soil samples, which makes them impractical for the rapid, large-scale monitoring required in modern precision agriculture. To achieve portable, rapid, and non-destructive detection of SOM content in agricultural field environments, this study developed a SOM detection device based on multispectral technology. A total of 237 soil samples were collected from Ningxia and Shaanxi provinces, and their visible-near infrared (Vis-NIR) spectra and reference SOM contents were measured. After spectral preprocessing and outlier removal, four feature wavelength selection algorithms, namely moving window partial least squares (MWPLS), iterative random forests (iRF), variable dimension particle swarm optimization based on combined moving window (VDPSO-CMW), and moving window smoothing on the ensemble of competitive adaptive reweighted sampling (MWS-ECARS), were applied in conjunction with the variable importance in projection (VIP) method. This integrated process identified seven characteristic wavelengths for SOM (420, 530, 600, 630, 855, 900, and 1345 nm), thereby substantially reducing the spectral dimensionality and complexity of the dataset. Furthermore, to mitigate the interference of soil moisture on spectral signals, a dedicated correction wavelength at
1450 nm was also selected for soil moisture adjustment. The hardware and software of the device were then designed around the selected characteristic wavelengths and the ESP32 embedded platform. The device integrates a main control module, a multispectral acquisition module, a power management module, and a human-machine interaction module. Multispectral data are acquired using narrow-band LED light sources at the characteristic wavelengths together with photodiodes. The developed device was then used to collect soil multispectral data. Three modelling strategies (dry soil modelling, global modelling, and stratified modelling) were constructed and compared using three machine learning algorithms: partial least squares (PLS), multilayer perceptron (MLP), and support vector machine (SVM). The stratified modelling strategy yielded the best predictive performance. This approach establishes an independent SOM prediction model for each moisture gradient. During prediction, the soil moisture content is first estimated; the sub-model corresponding to the determined moisture gradient is then selected to predict the SOM content, thereby enabling accurate SOM estimation under varying soil moisture conditions. Using the multispectral data acquired by the prototype device, a PLS-MLP stratified regression model was constructed, in which the PLS algorithm classifies the moisture gradient and the MLP algorithm estimates the SOM content. The model achieved a coefficient of determination (R²) of 0.84, a root mean square error (RMSE) of 3.93 g/kg, and a relative prediction deviation (RPD) of 2.50. After the model was embedded in the device for testing, the R² between the device's estimated values and laboratory-measured values reached 0.82, with an RMSE of 5.27 g/kg, an RPD of 1.64, a standard deviation for repeated tests of less than 0.4 g/kg, and a single detection time under 9 seconds. The device therefore enables rapid and accurate estimation of SOM content. It provides a valuable technical concept for the development of rapid soil fertility assessment equipment and shows strong application potential.
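
The stratified modelling strategy described in the abstract (estimate the moisture gradient first, then apply the sub-model trained for that gradient) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: `make_stratified_predictor`, the two moisture levels, the 0.5 threshold on the 1450 nm channel, and the toy linear sub-models are hypothetical stand-ins and do not reproduce the study's PLS classifier, its MLP regressors, or its fitted coefficients.

```python
from typing import Callable, Dict, Sequence

Spectrum = Sequence[float]            # one reading per LED band
Model = Callable[[Spectrum], float]   # maps a spectrum to SOM in g/kg


def make_stratified_predictor(
    classify_moisture: Callable[[Spectrum], str],
    sub_models: Dict[str, Model],
) -> Model:
    """Build a predictor that routes each spectrum to a per-moisture sub-model."""

    def predict(spectrum: Spectrum) -> float:
        # Step 1: estimate the moisture gradient of the sample.
        level = classify_moisture(spectrum)
        if level not in sub_models:
            raise KeyError(f"no sub-model for moisture level {level!r}")
        # Step 2: predict SOM with the sub-model for that gradient.
        return sub_models[level](spectrum)

    return predict


def toy_classifier(spectrum: Spectrum) -> str:
    # Hypothetical rule: the last band is the 1450 nm moisture channel,
    # and a higher reading is treated as drier soil.
    return "dry" if spectrum[-1] > 0.5 else "wet"


# Hypothetical sub-models; the coefficients are illustrative, not fitted.
toy_models: Dict[str, Model] = {
    "dry": lambda s: 10.0 + 20.0 * (1.0 - s[0]),
    "wet": lambda s: 12.0 + 25.0 * (1.0 - s[0]),
}

predict_som = make_stratified_predictor(toy_classifier, toy_models)
```

Keeping the classifier and the sub-models as separate callables means either stage could be swapped (for example, a different moisture classifier) without touching the routing logic.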
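
The three evaluation metrics reported above (R², RMSE, and RPD) can be computed as in the following sketch. The abstract does not state the exact formulas used, so the definitions here are assumptions: R² as one minus the ratio of residual to total sum of squares, RMSE over all n samples, and RPD as the sample standard deviation (n - 1 denominator) of the reference values divided by the RMSE. The function name `regression_metrics` is hypothetical.

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    measured: Sequence[float], predicted: Sequence[float]
) -> Dict[str, float]:
    """Return R^2, RMSE, and RPD under the assumed definitions above."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    mean_y = sum(measured) / n
    ss_res = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in measured)
    if ss_tot == 0:
        raise ValueError("reference values have zero variance")

    rmse = math.sqrt(ss_res / n)
    sd = math.sqrt(ss_tot / (n - 1))  # sample standard deviation of references
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        # A perfect fit has zero RMSE, so RPD is unbounded in that case.
        "rpd": sd / rmse if rmse > 0 else math.inf,
    }
```

All inputs are expected in the same unit (g/kg for SOM here), so RMSE carries that unit while R² and RPD are dimensionless.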