
Wheat canopy segmentation and height estimation in UAV remote sensing images based on improved SegFormer

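  • Summary: Plant height is an important phenotypic trait for characterizing crop growth status, predicting yield, and guiding precision management, and monitoring wheat plant height has significant scientific and practical value for precision field management and for breeding and cultivation. Building on the SegFormer model, this study proposes a wheat canopy segmentation method based on dual-branch feature encoding and multi-scale feature fusion decoding, and constructs an automatic wheat height estimation system based on the segmentation masks. First, in the encoder of the segmentation model, a CNN detail branch and a Transformer semantic branch are arranged in parallel to strengthen the joint representation of local texture features and global contextual features. Second, in the decoder, progressive upsampling and skip connections are used to fuse multi-scale features efficiently, improving the recovery of spatial information and the modeling of boundary details. Finally, the segmentation results were combined with the crop height model to estimate and validate wheat canopy height for 180 sample plots from the tillering stage to the flowering stage. The experimental results show that the improved model achieved a mean intersection over union (mIoU), mean pixel accuracy (mPA), and pixel accuracy (PA) of 80.92%, 89.42%, and 90.07%, respectively, all higher than those of the original SegFormer model. The coefficient of determination (R²) between the estimated heights and field measurements was 0.985, with a root mean square error (RMSE) of 0.73 cm and a relative RMSE (rRMSE) of 2.41%. These results indicate that the proposed method can efficiently and accurately extract the spatial distribution of the wheat canopy and plant height information, providing reliable technical support for monitoring winter wheat growth dynamics, field phenotyping research, and precision agriculture applications.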

     

    Abstract:
    Plant height is a key phenotypic trait that reflects crop growth status, supports yield prediction, and guides precision management. Monitoring wheat plant height therefore has significant scientific and practical value for precision field management and for breeding and cultivation. However, traditional manual measurements of plant height in the field are labor-intensive, time-consuming, and prone to human error, which limits their scalability and consistency in large-scale agricultural settings.
    In this study, we propose an automated wheat height estimation framework that integrates semantic segmentation and is applied across all growth stages. The approach uses multi-temporal, multi-view UAV RGB images combined with Structure from Motion (SfM)-based 3D reconstruction to generate Digital Surface Models (DSM) and Digital Terrain Models (DTM). The Crop Height Model (CHM) is then obtained by subtracting the DTM from the DSM. In parallel, an improved SegFormer-based semantic segmentation model extracts wheat canopy regions from the field images, removing background such as soil and other non-vegetation elements. Based on the resulting segmentation masks and the CHM, an automatic height estimation system is established. Specifically, the masks are used to isolate canopy regions in the CHM, and the 95th percentile of the height values within each connected vegetation cluster is taken as that cluster's canopy height. These values are aggregated at the plot level to obtain the average wheat height at each growth stage, enabling reliable phenotypic analysis and temporal monitoring.
    In the segmentation model’s encoder, a CNN-based detail branch and a Transformer-based semantic branch are arranged in parallel to strengthen the joint representation of local texture features and global contextual information. The CNN branch captures subtle edge structures and local texture variations, which are particularly important during early growth stages when canopies are sparse and fragmented. The Transformer branch encodes long-range dependencies and semantic context, enabling robust representation of large-scale canopy structures. In the decoder, progressive upsampling and skip connections are combined with a feature fusion module to integrate multi-scale features, improving the recovery of spatial information and the refinement of boundary details.
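The height estimation step described above (CHM = DSM - DTM, masking, the 95th percentile per connected cluster, then a plot-level average) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of 4-connectivity, the linear-interpolation percentile, and the toy rasters in meters are all assumptions made for illustration, and it uses only the Python standard library.

```python
from collections import deque

def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

def connected_clusters(mask):
    """Yield the (row, col) cells of each 4-connected canopy cluster (assumed connectivity)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, cluster = deque([(r, c)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                cluster.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            yield cluster

def plot_canopy_height(dsm, dtm, mask, q=95):
    """Mean over canopy clusters of the q-th percentile of CHM = DSM - DTM."""
    chm = [[s - t for s, t in zip(srow, trow)] for srow, trow in zip(dsm, dtm)]
    heights = [percentile([chm[y][x] for y, x in cluster], q)
               for cluster in connected_clusters(mask)]
    return sum(heights) / len(heights) if heights else None

# Hypothetical 3x3 plot: elevations in meters, mask value 1 marks canopy pixels.
dsm = [[10.5, 10.6, 10.0], [10.4, 10.0, 10.8], [10.0, 10.0, 10.9]]
dtm = [[10.0, 10.0, 10.0], [10.0, 10.0, 10.0], [10.0, 10.0, 10.0]]
mask = [[1, 1, 0], [1, 0, 1], [0, 0, 1]]
print(plot_canopy_height(dsm, dtm, mask))
```

Taking a high percentile rather than the maximum makes each cluster's height less sensitive to isolated reconstruction spikes in the CHM, which is one plausible reason for choosing the 95th percentile.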
    To combine the heterogeneous features from the dual-branch encoder and the multi-scale decoder, an Aggregation Layer is introduced as a dedicated feature fusion module. This module uses a combination of point-wise multiplication and addition to enhance the complementarity between local and global representations. It also incorporates convolutional refinement, normalization, and channel recalibration to improve the stability and semantic consistency of the fused output. Experimental results show that the proposed model achieves a mean intersection over union (mIoU), mean pixel accuracy (mPA), and pixel accuracy (PA) of 80.92%, 89.42%, and 90.07%, respectively, all higher than those of the original SegFormer model. The estimated canopy heights agree closely with field measurements, with a coefficient of determination (R²) of 0.985, a root mean square error (RMSE) of 0.73 cm, and a relative RMSE (rRMSE) of 2.41%. In addition, the method captures the temporal dynamics of wheat growth, showing consistent height accumulation across growth stages. These results indicate that the proposed method can efficiently and accurately retrieve the spatial distribution and plant height of wheat canopies, providing reliable technical support for dynamic monitoring of winter wheat growth, field phenotyping, and precision agriculture applications.
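For readers unfamiliar with the reported metrics, the sketch below shows how they are commonly computed. The definitions are the conventional ones and are assumed here, since the abstract does not state them: PA, mPA, and mIoU from a confusion matrix, R² as 1 - SS_res/SS_tot, and rRMSE as RMSE divided by the mean measured height. The function names and toy numbers are hypothetical and are not the study's data.

```python
import math

def segmentation_metrics(conf):
    """Return (PA, mPA, mIoU); conf[i][j] counts pixels of true class i predicted as j.

    Assumes every class occurs at least once, so no denominator is zero.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    diag = [conf[i][i] for i in range(n)]
    row_sum = [sum(conf[i]) for i in range(n)]                       # true pixels per class
    col_sum = [sum(conf[i][j] for i in range(n)) for j in range(n)]  # predicted per class
    pa = sum(diag) / total
    mpa = sum(d / r for d, r in zip(diag, row_sum)) / n
    miou = sum(d / (r + c - d) for d, r, c in zip(diag, row_sum, col_sum)) / n
    return pa, mpa, miou

def height_metrics(measured, estimated):
    """Return (R2, RMSE, rRMSE in percent) for paired height values."""
    n = len(measured)
    mean_obs = sum(measured) / n
    ss_res = sum((m - e) ** 2 for m, e in zip(measured, estimated))
    ss_tot = sum((m - mean_obs) ** 2 for m in measured)
    rmse = math.sqrt(ss_res / n)
    return 1 - ss_res / ss_tot, rmse, 100 * rmse / mean_obs

# Toy two-class confusion matrix (background, wheat) and toy heights in cm.
print(segmentation_metrics([[40, 10], [5, 45]]))
print(height_metrics([20.0, 30.0, 40.0], [21.0, 29.0, 40.0]))
```

Note that some studies report R² as the squared Pearson correlation of a fitted regression line instead, which can differ from the value computed here when the estimates are biased.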

     
