Abstract:
Maize is one of the most vital crops for national food security, and cross-pollination is essential to its yield. Detasseling is often required for cross-pollination, and detecting tassels can guide the detasseling process, so accurate detection and counting of maize male tassels are crucial to reliable maize production. Mechanized operation promises the efficiency and accuracy needed for large-scale field management. Existing research has explored tassel detection, but it typically covers only a limited number of maize varieties and falls short of the practical demands of field applications. Moreover, the small tassels of diverse varieties present few distinguishing features and are easily obscured by the background because of their varying morphologies, whereas larger tassels are prone to mutual occlusion. Drone technology has shown promise for the phenotypic analysis of various crops. This study aims to efficiently detect maize tassels across multiple varieties in real fields by combining drone imagery with deep learning. High-resolution images were captured by unmanned aerial vehicles (UAVs), and an image dataset covering 74 maize varieties, termed the multiple-variety maize tassel dataset, was constructed. YOLOv9m was adopted as the baseline, and its feature extraction was enhanced for tassels of varied shapes and sizes through two modules: an interactive cross-layer fusion feature enhancement module and a multi-scale axial awareness module. Multi-level feature fusion was employed to balance feature maps from different levels, strengthening the characteristic information of tassels against background interference and occlusion and improving the efficiency of feature fusion. The multi-scale axial awareness module extracted the key characteristic information of the diverse tassel varieties, integrating global and local tassel information to capture variety-specific traits. The results show that the improved model achieved a precision of 92.9%, a recall of 92.5%, and a mean average precision (mAP@0.5) of 93.9%. Its mAP@0.5 was 0.2, 1.9, 1.6, and 9.2 percentage points higher than those of YOLOv9m, YOLOv10m, YOLOv11m, and RT-DETR-m, respectively. The model detected the various types of maize tassels effectively, reducing both misdetections and omissions of male tassels, and remained effective in occluded and dense scenes. Its generalizability was validated on the public maize tassel detection and counting (MTDC) dataset, where it achieved a precision of 91.9%, a recall of 85.9%, and an mAP@0.5 of 92.1%; the mAP@0.5 increased by 1.6, 6.3, 3.3, and 11.6 percentage points over YOLOv9m, YOLOv10m, YOLOv11m, and RT-DETR-m, respectively. Results on both datasets demonstrate the model's potential to identify tassels in complex scenarios and provide basic technical support for subsequent high-throughput phenotypic analysis and yield estimation of maize.
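For illustration only, the sketch below shows one common way an axial-awareness component can be realized: self-attention applied separately along the height and width axes of a detection feature map, so that global row and column context complements local tassel features. The class and parameter names (AxialAttention, AxialAwarenessBlock, channels, heads) are assumptions for this sketch; it does not reproduce the authors' multi-scale axial awareness module or their modified YOLOv9m, and it omits the multi-scale branches.

```python
# Minimal sketch of axial attention (not the authors' implementation).
import torch
import torch.nn as nn

class AxialAttention(nn.Module):
    """Multi-head self-attention applied along a single spatial axis."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)

    def forward(self, x: torch.Tensor, axis: str) -> torch.Tensor:
        b, c, h, w = x.shape
        if axis == "h":   # attend along the height axis, one column at a time
            seq = x.permute(0, 3, 2, 1).reshape(b * w, h, c)
        else:             # attend along the width axis, one row at a time
            seq = x.permute(0, 2, 3, 1).reshape(b * h, w, c)
        out, _ = self.attn(seq, seq, seq)
        if axis == "h":
            out = out.reshape(b, w, h, c).permute(0, 3, 2, 1)
        else:
            out = out.reshape(b, h, w, c).permute(0, 3, 1, 2)
        return x + out    # residual connection preserves local information

class AxialAwarenessBlock(nn.Module):
    """Height-axis then width-axis attention; one layer is reused for brevity."""
    def __init__(self, channels: int):
        super().__init__()
        self.axial = AxialAttention(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.axial(self.axial(x, "h"), "w")

# Usage with a dummy feature map from a detection backbone
feat = torch.randn(2, 64, 40, 40)
print(AxialAwarenessBlock(64)(feat).shape)  # torch.Size([2, 64, 40, 40])
```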