Abstract:
Wheat is one of the most extensively cultivated crops and is central to global food security. Wheat yellow dwarf disease, caused by the barley yellow dwarf virus complex (BYDVs), is a major viral threat that can occur year-round and is widespread in the Huang-Huai wheat region, posing a substantial risk to wheat yield and quality. Infected plants typically exhibit leaf yellowing, stunted growth, delayed maturity, and shriveled grains. Manual detection provides only a preliminary diagnosis based on visual symptoms, while laboratory confirmation often requires costly equipment or specialized expertise, limiting large-scale field screening. In this study, DSFNet, a field-scale detection model, was proposed for drone-acquired multispectral imagery, designed around the multi-stage development of wheat yellow dwarf disease. Two components were integrated - dual-stream adaptive synergistic fusion (DASF) and spatiotemporal joint feature mapping (SFFM) - to achieve precise identification of the disease. The encoder incorporated the DASF module, in which convolutional neural networks (CNNs) and Transformers were combined to dynamically balance local and global information. The CNN branch focused on fine-grained features, such as lesion edges, textures, and colors, to detect contour and structural variations. In contrast, the Transformer branch employed self-attention mechanisms to capture long-range dependencies, modeling semantic relationships among regions and revealing the spatial distribution patterns of the disease on a global scale. Adaptive weight allocation in the DASF module fused the CNN's local features with the Transformer's global context, yielding rich feature representations that combine spatial detail with semantic depth. The decoder integrated the SFFM module to jointly exploit spatial and frequency-domain information, preserving spatial features while incorporating frequency-domain analysis.
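The adaptive weight allocation described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name, the gating projection `w_gate`, and all shapes are illustrative assumptions; it only shows the general pattern of fusing two feature streams with learned, per-pixel softmax weights.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dasf_fuse(local_feat, global_feat, w_gate):
    """Hypothetical sketch of DASF-style adaptive fusion.

    local_feat, global_feat: (H, W, C) maps from the CNN and Transformer
    branches; w_gate: (2C, 2) assumed learned projection mapping the
    concatenated features to two per-pixel fusion weights.
    """
    # Concatenate the two streams channel-wise: (H, W, 2C)
    concat = np.concatenate([local_feat, global_feat], axis=-1)
    # Per-pixel adaptive weights, normalized to sum to 1: (H, W, 2)
    weights = softmax(concat @ w_gate, axis=-1)
    # Convex combination of the two branches at every pixel
    return weights[..., :1] * local_feat + weights[..., 1:] * global_feat
```

Because the weights are a per-pixel convex combination, the fused value always lies between the local and global responses, letting the network lean on whichever branch is more informative at each location.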
Regions with substantial grayscale variation were captured, enhancing multi-scale feature detection. Finally, the multi-scale features were fused with the original features via a multi-layer perceptron (MLP) for precise segmentation. Field experiments were conducted on a multispectral image dataset spanning multiple growth stages, diverse lighting conditions, and multiple plots. DSFNet's performance was compared against mainstream models, including SegFormer, UNet, DBFormer, SFFNet, DECSNet, DeepLabV3+, and PSPNet. DSFNet outperformed all baseline models, achieving a mean pixel accuracy (mPA) of 92.66% and a mean intersection over union (mIoU) of 86.77%, improving mPA by 3.96, 2.21, 3.65, 2.82, 3.61, 0.28, and 1.95 percentage points and mIoU by 5.77, 5.53, 5.43, 5.23, 4.49, 2.25, and 1.90 percentage points, respectively. On an independent test dataset, DSFNet maintained high accuracy, with mPA and mIoU values of 89.71% and 82.56%, improvements of 6.35-10.16 percentage points over the other models. DSFNet also demonstrated strong robustness and transferability in complex, real-world scenarios across different acquisition periods. Overall, this work presents an efficient and scalable solution for large-scale field detection of wheat yellow dwarf disease. By combining optimal feature fusion, frequency-spatial interaction, and adaptive semantic modeling, it provides a practical framework for precision agriculture and offers technical support for the rapid, accurate, and intelligent monitoring of crop health in the prevention and control of wheat viral diseases.
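The frequency-domain step in the decoder (capturing regions with substantial grayscale variation) can be sketched as a Fourier high-pass enhancement. This is an assumption-laden illustration, not the SFFM module itself: the function name, the radial cutoff, and the additive fusion are all hypothetical choices that show the general spatial-frequency idea.

```python
import numpy as np

def sffm_enhance(feat, cutoff=0.25):
    """Hypothetical sketch of SFFM-style frequency-domain enhancement.

    feat: (H, W) feature/grayscale map. A radial high-pass mask in the
    Fourier domain keeps high-frequency components (sharp grayscale
    changes, e.g. lesion boundaries), which are added back to the
    preserved spatial features.
    """
    H, W = feat.shape
    # Centered 2-D spectrum of the feature map
    spectrum = np.fft.fftshift(np.fft.fft2(feat))
    # Radial high-pass mask: suppress frequencies near the centre (low freq.)
    yy, xx = np.mgrid[:H, :W]
    radius = np.hypot(yy - H / 2, xx - W / 2)
    mask = radius > cutoff * min(H, W) / 2
    # Back to the spatial domain: only high-frequency detail survives
    high = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real
    # Fuse: preserved spatial features + high-frequency detail
    return feat + high
```

On a perfectly flat region the high-pass component vanishes and the features pass through unchanged, while sharp transitions such as lesion edges are amplified.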