Application of geographic similarity in sampling for crop remote sensing classification

赵凡; 张新; 董文; 郜丽静; 彭玉; 张冬

doi:10.11975/j.issn.1002-6819.202508050

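Summary: Sample representativeness is a key factor limiting the accuracy and reliability of remote sensing crop classification, particularly in county-scale studies that require detailed planning and have limited sample sizes. Existing sampling methods mostly rely on random or uniform point placement or on empirical zoning, and they struggle to capture crop spectral differences and how their effect varies with sample size. To address this, the study takes county-level crop classification as the experimental scenario and constructs a sample selection strategy based on geographic similarity, with stratified random sampling and systematic sampling as baselines. Comparative crop classification experiments were run with three types of model (support vector machine, random forest, and temporal convolutional network), and classification performance and sample representativeness were systematically evaluated under different sample sizes. The results show that, with 42 samples, geographic similarity sampling achieved effective coverage of the feature space with relatively few samples; its classification accuracy was about 2% to 12% higher than the baselines for the support vector machine and the temporal convolutional network, and differed little from the other sampling strategies for the random forest. As the number of samples increased, the accuracy differences among the three sampling strategies gradually narrowed, indicating that the advantage of geographic similarity sampling lies mainly in the stage where samples are few. Further analysis found that the effect of geographic similarity sampling differs by crop: crops with clear spectral features reached high accuracy with few samples, whereas for crops with strong spectral heterogeneity or a high degree of mixing, accuracy rose with sample size and then stabilized.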

Abstract: Accurate crop type classification from remote sensing imagery is a common requirement in precision agriculture. Its reliability depends largely on how representative the training samples are, especially at the county scale, where detailed planning is needed and sample sizes are limited. Conventional sampling strategies, such as random sampling, systematic (grid) sampling, or empirical stratification, primarily emphasize spatial uniformity or prior zoning, and they cannot adequately capture crop spectral variability or its interaction with sample size. In this study, a sampling strategy based on geographical similarity in feature space was constructed, with county-level crop classification as the experimental scenario. Stratified random sampling and systematic sampling were adopted as baselines. Three classification models, support vector machine (SVM), random forest (RF), and temporal convolutional network (TCN), were used in comparative experiments under multiple sample-size conditions. Performance was evaluated in terms of classification accuracy and sample representativeness. The results show that, with 42 training samples, similarity sampling achieved effective coverage of the feature space with fewer samples and reached higher accuracy than the baselines. Specifically, classification accuracy improved by approximately 2%-12% in the SVM and TCN models, while the differences among sampling strategies in the RF model remained relatively small. The accuracy differences among the sampling strategies gradually narrowed as sample size increased, indicating that the advantage of similarity sampling lies mainly in the early stage, when samples are limited. Furthermore, sampling effectiveness depended mainly on crop-specific spectral heterogeneity. Crops with stable spectral signatures reached high classification accuracy with limited samples, whereas crops with higher spectral heterogeneity or mixed spectral behavior benefited more from similarity sampling. Sample representativeness was also enhanced through wider coverage of the feature space. Similarity sampling was therefore particularly suitable for complex crop classification tasks under limited sampling. The experiments were confined to a relatively flat plain with low environmental heterogeneity, which may have constrained the full potential of similarity sampling under more complex environmental gradients; the approach still needs to be applied in mountainous or highly heterogeneous regions. In addition, explicit indicators are still needed to quantify the relationship among classification accuracy, sample representativeness, and environmental complexity. Future work can incorporate explicit representativeness metrics and examine the coupling mechanisms among sample representativeness, landscape heterogeneity, and accuracy, so that the robustness of sampling strategies for remote sensing crop classification can be further improved.
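
The idea of covering the feature space with few samples can be illustrated with a short sketch. The abstract does not specify the selection algorithm, so the code below swaps in greedy farthest-point (maximin) selection purely as a stand-in: it is not the authors' geographic similarity method. The function names `farthest_point_sample` and `coverage_radius`, the synthetic feature matrix, and the use of Euclidean distance are all assumptions made for illustration; only the sample count of 42 comes from the abstract.

```python
import numpy as np

def farthest_point_sample(features, n_samples, start=0):
    """Greedy maximin selection (illustrative stand-in, not the paper's method).

    Repeatedly picks the candidate that is farthest, in feature space,
    from everything selected so far.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    if not 1 <= n_samples <= n:
        raise ValueError("n_samples must be between 1 and the number of candidates")
    selected = [start]
    # Distance from every candidate to its nearest selected sample.
    min_dist = np.linalg.norm(features - features[start], axis=1)
    while len(selected) < n_samples:
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return np.array(selected)

def coverage_radius(features, selected):
    """Largest distance from any candidate to its nearest selected sample.

    A smaller radius means the selected samples cover the feature space better.
    """
    features = np.asarray(features, dtype=float)
    diff = features[:, None, :] - features[selected][None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    return float(dist.min(axis=1).max())

# Synthetic stand-in for per-pixel spectral/temporal features (hypothetical data).
rng = np.random.default_rng(0)
pixels = rng.normal(size=(200, 6))

idx = farthest_point_sample(pixels, 42)  # 42 matches the sample count in the abstract
print("coverage radius with 10 samples:", coverage_radius(pixels, idx[:10]))
print("coverage radius with 42 samples:", coverage_radius(pixels, idx))
```

Because the greedy selection is nested, the coverage radius can only shrink or stay the same as more samples are added, which loosely mirrors the reported narrowing of differences among strategies at larger sample sizes. A real comparison would use the study's own similarity measure and field samples.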

Application of geographic similarity in sampling for crop remote sensing classification