Abstract:
The jointing stage is a critical period in maize growth and development. It is the period of fastest growth and is extremely sensitive to drought stress. Accurate diagnosis of drought stress at this stage is crucial for optimizing irrigation strategies and ensuring yield stability. Traditional methods for identifying drought stress, such as those based on environmental factors or remote sensing technologies, have limitations including indirect assessment, high equipment costs, and difficulty in reflecting the actual water status of plants. To address these issues, this study proposes an end-to-end, three-stage modular diagnostic framework for autumn maize at the jointing stage that integrates image preprocessing, data augmentation, and deep learning modeling.

From September to November 2024, field experiments were conducted with the maize variety "Jingnongke 728" at an agricultural experiment base in Beijing, using a potted water-control method. Visible-light canopy images were collected under different water treatments, and a dataset labeled by soil volumetric water content was constructed with reference to the Meteorological Drought Grade standards. The dataset comprises 1338 original images across five drought grades: adequate irrigation, mild drought, moderate drought, severe drought, and extreme drought. Data augmentation (Gaussian blur, brightness adjustment, tone modification, and geometric stretching, applied to simulate variable field conditions) expanded the dataset sixfold to 8028 images, which were split into training, validation, and test sets.

In the image preprocessing stage, threshold segmentation in the HSV color space was used to extract the maize plant regions, effectively removing background noise from soil and debris. For the deep learning stage, the Dense-BAM model was built by inserting a BAM attention module after each dense block of DenseNet-169 and training with the Label-Distribution-Aware Margin (LDAM) loss function. The BAM module enhances feature extraction by modeling channel-wise dependencies via global average pooling and a multi-layer perceptron and by capturing spatial saliency through dilated convolutions. LDAM sharpens class-boundary learning by assigning each class a margin that grows as its frequency falls, which particularly benefits minority classes such as extreme drought.

Experimental results showed that the Dense-BAM model achieved an accuracy of 99.59% on the test set, 3.91 percentage points higher than the original DenseNet-169. Compared with MobileNet-V3, ResNet50, VGG16, Vit_b_16, and Convnext_small, accuracy was higher by 14.03, 5.65, 4.07, 6.73, and 4.82 percentage points, respectively. Ablation studies confirmed that BAM outperforms other attention mechanisms such as CBAM, SE, LSK, and MLA, and that multi-point embedding (after all dense blocks) balances feature refinement across scales.

This study provides a high-precision method for diagnosing drought stress in jointing-stage maize and offers technical support for intelligent water management in agricultural production. Future work will focus on expanding the dataset to include more varieties and growth stages and on integrating multi-modal data, such as thermal infrared images and 3D point cloud models, to further improve the generalization ability of the model.
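
As an illustration of the HSV threshold segmentation used in the preprocessing stage, the following minimal sketch masks pixels whose hue, saturation, and value fall inside a green band. The abstract does not report the thresholds used in the study, so `H_MIN`, `H_MAX`, `S_MIN`, and `V_MIN` are hypothetical values, and the helper names `plant_mask` and `apply_mask` are introduced here only for illustration.

```python
import colorsys

# Hypothetical thresholds for "green plant" pixels; the study's actual
# HSV ranges are not given in the abstract.
H_MIN, H_MAX = 60 / 360, 180 / 360   # hue band, yellow-green to cyan-green
S_MIN, V_MIN = 0.25, 0.20            # reject washed-out and very dark pixels


def plant_mask(image):
    """Return a boolean mask marking pixels inside the HSV band.

    `image` is a list of rows; each row is a list of (R, G, B) tuples, 0..255.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            mask_row.append(H_MIN <= h <= H_MAX and s >= S_MIN and v >= V_MIN)
        mask.append(mask_row)
    return mask


def apply_mask(image, mask, fill=(0, 0, 0)):
    """Keep masked pixels and replace all others with `fill`."""
    return [[px if keep else fill for px, keep in zip(row, mask_row)]
            for row, mask_row in zip(image, mask)]


# Tiny 2x2 example: leaf green, soil brown, grey debris, dark green.
demo = [[(60, 160, 50), (120, 85, 60)],
        [(128, 128, 128), (20, 90, 30)]]
demo_mask = plant_mask(demo)
segmented = apply_mask(demo, demo_mask)
```

In this toy input, the two green pixels are kept and the brown and grey pixels are replaced by the fill color. A practical pipeline would more likely run the same thresholding on whole image arrays with an image-processing library.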
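
The effect of the LDAM loss on minority classes can be sketched as follows, using the published LDAM formulation in which the margin of class j is proportional to n_j^(-1/4), where n_j is the number of training samples in that class. The class counts, the maximum margin of 0.5, and the scale factor `s = 30` are assumptions chosen for illustration, not values reported in this study, and the function names are hypothetical.

```python
import math


def ldam_margins(class_counts, max_margin=0.5):
    """Margins proportional to n_j ** -0.25, rescaled so that the rarest
    class receives `max_margin` (an assumed value)."""
    raw = [n ** -0.25 for n in class_counts]
    scale = max_margin / max(raw)
    return [m * scale for m in raw]


def ldam_loss(logits, target, margins, s=30.0):
    """LDAM loss for one sample: subtract the target class margin from its
    logit, then apply scaled softmax cross-entropy."""
    adjusted = [z - margins[j] if j == target else z
                for j, z in enumerate(logits)]
    scaled = [s * z for z in adjusted]
    peak = max(scaled)  # subtract the maximum for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    return log_sum_exp - scaled[target]


# Hypothetical class counts for five drought grades (NOT the study's data).
counts = [500, 400, 300, 200, 100]
margins = ldam_margins(counts)

logits = [1.0, 0.2, 0.1, 0.0, 0.9]
loss_with_margin = ldam_loss(logits, 4, margins)
loss_without_margin = ldam_loss(logits, 4, [0.0] * len(counts))
```

Because the rarest class receives the largest margin, a sample from that class must be classified with a wider gap before its loss becomes small, which is the mechanism that favors minority classes.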
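
The reported figures can be cross-checked with simple arithmetic. The sketch below derives the augmentation factor from the two dataset sizes and the baseline accuracies implied by the reported percentage-point gains. The derived accuracies are inferences from the numbers above, not values stated directly in this abstract.

```python
# Dataset sizes reported in the abstract.
original_images = 1338
augmented_images = 8028
augmentation_factor = augmented_images // original_images

# Reported Dense-BAM test accuracy and its gains over each baseline,
# in percentage points.
dense_bam_acc = 99.59
gains = {
    "DenseNet-169": 3.91,
    "MobileNet-V3": 14.03,
    "ResNet50": 5.65,
    "VGG16": 4.07,
    "Vit_b_16": 6.73,
    "Convnext_small": 4.82,
}

# Baseline accuracy implied by each gain (derived, not reported directly).
implied = {name: round(dense_bam_acc - gain, 2) for name, gain in gains.items()}

for name, acc in implied.items():
    print(f"{name}: {acc:.2f}%")
```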