Abstract:
Detecting chili flowers is a key step in mechanical pollination, and it must work accurately in natural environments. This study proposes a lightweight and efficient detection model, YOLOv8n-Chili Flower, built on the YOLOv8n architecture. Several modifications improve detection accuracy, sensitivity, and computational efficiency for resource-constrained scenarios such as mobile pollination robots. First, an Efficient Multi-scale Lightweight Attention Mechanism Module (EMA) was introduced into the neck layer to capture the multi-scale features of chili flowers. The EMA module improved detection sensitivity and accuracy by focusing on critical features in complex natural scenes with occlusion, varying lighting, and dense foliage. Second, the conventional C2f module in the backbone layer was replaced with a Group Separable Convolution (GSConv) module, which reduced information redundancy during feature extraction while preserving key features. The GSConv module also enhanced the effectiveness of the attention mechanism and simplified the model architecture, lowering computational complexity so that real-time detection is feasible on low-computing-power devices such as embedded systems. Finally, the Weighted Intersection over Union (WIoU) loss function replaced the traditional Complete Intersection over Union (CIoU) loss to optimize the bounding-box regression loss. A smoothing term was also introduced to improve the precision of the overlap-area computation between predicted and ground-truth bounding boxes.
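The overlap computation that the regression loss builds on can be illustrated with a minimal sketch. The abstract does not give the exact form of the smoothing term, so the small constant `eps` added to the union below is an assumption, and the function name `smoothed_iou` and the `(x1, y1, x2, y2)` box layout are hypothetical. The sketch computes plain IoU only, not the full WIoU loss.

```python
from typing import Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def smoothed_iou(pred: Box, truth: Box, eps: float = 1e-7) -> float:
    """IoU of two axis-aligned boxes, with eps added to the union (assumed form)."""
    # Width and height of the overlap rectangle, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(pred[2], truth[2]) - max(pred[0], truth[0]))
    inter_h = max(0.0, min(pred[3], truth[3]) - max(pred[1], truth[1]))
    inter = inter_w * inter_h

    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_truth = (truth[2] - truth[0]) * (truth[3] - truth[1])
    union = area_pred + area_truth - inter

    # eps keeps the ratio finite even when both boxes have zero area.
    return inter / (union + eps)
```

For example, `smoothed_iou((0, 0, 2, 2), (1, 0, 3, 2))` is close to 1/3, since the boxes share an area of 2 out of a union of 6. An IoU-based regression loss would typically start from `1 - smoothed_iou(...)` and add further terms.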
Experimental results show that the YOLOv8n-Chili Flower model achieved a recall of 94.6% and a mean average precision (mAP) of 95.9%, which are 0.9 and 0.6 percentage points higher, respectively, than those of the original YOLOv8n. In terms of computational efficiency, the modified model required 7.2 G FLOPs, 2.39 M parameters, and a model size of 5.0 MB, reductions of 12.20%, 20.60%, and 20.63%, respectively. Compared with state-of-the-art models such as YOLOv5s, YOLOv7-tiny, YOLOv8s, and YOLOv9, the model offered a better balance between detection accuracy and lightweight design. The improved model was then deployed on an NVIDIA Jetson AGX Orin computing platform for real-world testing, where it achieved a correct detection rate of 83.25% at 99.02 frames per second, outperforming existing solutions. These findings can provide technical support for real-time chili flower detection and lightweight deployment in mechanical pollination.
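The baseline values behind the reported reductions are not stated in the abstract, but they can be back-calculated as a consistency check. The sketch below assumes each percentage is a relative reduction against the original model, so that baseline = value / (1 - reduction). The helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(value: float, reduction_pct: float) -> float:
    """Baseline implied by a reduced value and its relative reduction in percent."""
    return value / (1.0 - reduction_pct / 100.0)


# Reported value and reported relative reduction for each efficiency metric.
reported = {
    "FLOPs (G)": (7.2, 12.20),
    "Parameters (M)": (2.39, 20.60),
    "Model size (MB)": (5.0, 20.63),
}

for name, (value, pct) in reported.items():
    print(f"{name}: implied baseline {implied_baseline(value, pct):.2f}")
```

Under this assumption the implied baselines are about 8.2 G FLOPs, 3.01 M parameters, and 6.3 MB. These are derived figures, not values quoted from the abstract.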