Abstract:
Agricultural production safety is an important component of food safety. Because Oryza sativa subsp. japonica Kato. is a rice consumed daily, rapid inspection for low-quality mixed seeds is an important research task in related fields. In this study, spectral signals of 220 samples of mixed and pure rice varieties were collected using terahertz time-domain spectroscopy. The spectral data were preprocessed by Fourier transform (FT), which converted the time-domain signals into frequency-domain signals used as the modeling data sets. Five pattern recognition models, including QUEST, were compared for classifying the samples. To further optimize the models, three feature selection algorithms were applied: the random forest (RF) algorithm, the successive projections algorithm (SPA), and the coupled variable combination population analysis-iteratively retaining informative variables algorithm (VCPA-IRIV). The three algorithms selected 9, 6, and 25 important feature frequencies, respectively, and the characteristic frequencies selected by the coupled VCPA-IRIV algorithm contained the most abundant spectral information. Modeling after characteristic frequency selection was significantly superior to full-spectrum modeling in both analysis speed and recognition accuracy. The QUEST and KNN classifiers based on the 25 characteristic frequencies screened by VCPA-IRIV both achieved 100% identification accuracy. The coupled VCPA-IRIV algorithm could effectively select information-rich characteristic frequencies from the terahertz spectrum and effectively improve the accuracy of the established recognition model. The identification model based on terahertz spectroscopy and a coupled feature selection algorithm was fast and accurate, offering a new approach for detecting poor-quality Oryza sativa subsp. japonica Kato. seeds.
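The workflow described above (Fourier transform of time-domain signals, selection of a few characteristic frequencies, then KNN classification) can be illustrated with a minimal sketch. It is not the authors' implementation: the signals are synthetic, the function names (`make_signal`, `dft_magnitude`, `select_features`, `knn_predict`) are hypothetical, and the frequency bins in `SELECTED_BINS` are hand-picked stand-ins for the output of a selection algorithm such as VCPA-IRIV, which is not implemented here.

```python
import cmath
import math
from collections import Counter


def make_signal(amp_a, amp_b, n=32):
    """Synthetic time-domain trace: two sinusoids at bins 3 and 7 (assumed)."""
    return [amp_a * math.sin(2 * math.pi * 3 * t / n)
            + amp_b * math.sin(2 * math.pi * 7 * t / n) for t in range(n)]


def dft_magnitude(signal):
    """Magnitude spectrum (bins 0..N/2) of a real time-domain signal."""
    n = len(signal)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal)))
            for k in range(n // 2 + 1)]


def select_features(spectrum, bins):
    """Keep only the magnitudes at the chosen characteristic frequency bins."""
    return [spectrum[b] for b in bins]


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hand-picked bins standing in for a feature selection result (assumption).
SELECTED_BINS = [3, 7]

# Synthetic training set: "mixed" samples carry an extra component at bin 7.
TRAIN_PARAMS = [(1.0, 0.00, "pure"), (0.9, 0.05, "pure"), (1.1, 0.02, "pure"),
                (1.0, 0.50, "mixed"), (0.9, 0.45, "mixed"), (1.1, 0.55, "mixed")]
TRAIN_X = [select_features(dft_magnitude(make_signal(a, b)), SELECTED_BINS)
           for a, b, _ in TRAIN_PARAMS]
TRAIN_Y = [label for _, _, label in TRAIN_PARAMS]


def classify(signal):
    """Full pipeline: FT, feature selection, KNN vote."""
    features = select_features(dft_magnitude(signal), SELECTED_BINS)
    return knn_predict(TRAIN_X, TRAIN_Y, features)
```

Calling `classify(make_signal(1.0, 0.03))` runs an unseen synthetic trace through the same three steps. Reducing each spectrum to a handful of bins before the distance computation is what makes the selected-frequency model cheaper than a full-spectrum one in this sketch.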