Abstract:
Accurate and rapid detection of sea cucumbers is required in the complex underwater environments of ocean ranching. However, sea cucumbers are small targets that are difficult to distinguish from the background. Detection is further hindered by weak lighting, severe noise, and occlusion caused by overlapping sea cucumbers. Therefore, this study proposes the YOLOv10-MECAS model, an improvement on the YOLOv10s baseline, to enhance detection performance. The MECAS (median-enhanced channel and spatial attention) module was designed by combining a median-pooling-enhanced channel attention with a spatial attention. MECAS effectively retained target features and reduced image noise. Sea cucumber features were then captured using multiscale depthwise convolution. Additionally, the SAConv (switchable atrous convolution) module was introduced to replace the standard 3×3 convolutional module in the SCDown (spatial-channel decoupling downsampling) module. This expanded the receptive field without increasing the convolution kernel size, thereby improving the model's ability to capture the features of occluded targets. The UDCP (underwater dark channel prior) enhancement algorithm was applied to the dataset images to improve their contrast and overall image quality. Furthermore, the MPDIoU (minimum points distance intersection over union) regression loss function was adopted to reduce the distortion of detection boxes caused by large sample variability, thereby enhancing the robustness of the model. The improved model was evaluated experimentally on a sea cucumber dataset collected from real underwater scenarios.
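To make the MPDIoU regression loss more concrete, the following is a minimal sketch in plain Python. It assumes the commonly cited definition, in which the IoU is penalized by the squared distances between the two boxes' top-left corners and between their bottom-right corners, each normalized by the squared diagonal of the input image. The abstract does not state the formula, so the exact form used in YOLOv10-MECAS is an assumption, and the function name `mpdiou_loss` and its arguments are hypothetical.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """Sketch of an MPDIoU-style loss for one pair of boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    img_w and img_h are the input image width and height (assumed normalizer).
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h

    # Union area and plain IoU.
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2

    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


if __name__ == "__main__":
    # Hypothetical boxes on a 640x640 input, for illustration only.
    print(mpdiou_loss((100, 100, 200, 200), (110, 105, 210, 195), 640, 640))
```

Under this definition the corner-distance terms keep changing even when two boxes do not overlap, so a predicted box that is far from the ground truth is penalized more than one that is close, which plain IoU cannot express.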
The experimental results show that, on the original dataset, the improved model achieved a precision of 85.7%, a recall of 81.5%, and a mean average precision at IoU (intersection over union) 0.5 of 89.7%, indicating improvements of 6.4%, 4.4%, and 5.0%, respectively, over the baseline model. Compared with Faster-RCNN, SSD, YOLOv5s, YOLOv7, YOLOv8s, YOLOv9s, and YOLOv11s, the improved model raised the mean average precision at IoU 0.5 by 16.5, 15.4, 5.5, 6.1, 5.3, 5.8, and 4.6 percentage points, respectively, while maintaining advantages in the number of parameters, GFLOPs (giga floating-point operations), and FPS (frames per second). On the dataset enhanced by the UDCP algorithm, the improved model achieved a precision of 86.4%, a recall of 82.6%, and a mean average precision at IoU 0.5 of 90.4%, indicating improvements of 3.3%, 2.1%, and 4.8%, respectively, over the baseline model. Compared with Faster-RCNN, SSD, YOLOv5s, YOLOv7, YOLOv8s, YOLOv9s, and YOLOv11s, the mean average precision at IoU 0.5 was improved by 16.1, 15.4, 5.5, 5.9, 5.2, 6.1, and 4.3 percentage points, respectively. YOLOv10s with the MECAS module also outperformed YOLOv10s combined with current mainstream attention modules, such as LSKA (large selective kernel attention), CA (coordinate attention), and ECA (efficient channel attention), in underwater sea cucumber detection. Finally, the experiments verified that YOLOv10s with MPDIoU also performed better than with CIoU (complete intersection over union), EIoU (enhanced intersection over union), GIoU (generalized intersection over union), DIoU (distance intersection over union), and SIoU (scaled intersection over union). Consequently, the detection accuracy for small sea cucumber targets was effectively improved in complex underwater environments. The findings can provide a theoretical basis for detecting sea cucumbers during harvesting.
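As a reminder of what precision and recall at an IoU threshold of 0.5 measure, the sketch below matches predictions to ground-truth boxes greedily in order of confidence. The function names and the sample boxes are hypothetical and are not taken from the study's dataset or evaluation code. The sketch also stops short of mean average precision, which additionally averages precision over recall levels.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, iou_thr=0.5):
    """preds: list of (confidence, box); truths: list of boxes.

    Each ground-truth box can be matched by at most one prediction.
    """
    matched = set()
    tp = 0
    # Visit predictions from most to least confident.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        if best_j is not None and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1  # true positive: overlaps an unmatched ground truth enough

    fp = len(preds) - tp   # predictions with no valid match
    fn = len(truths) - tp  # ground-truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical boxes, for illustration only.
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0.9, (0, 0, 10, 9)), (0.8, (50, 50, 60, 60)), (0.6, (21, 21, 30, 30))]
    print(precision_recall(preds, truths))
```

In this toy case two of the three predictions match a ground-truth box and no ground-truth box is missed, so precision is two thirds and recall is 1.0.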