Nighttime beef cattle behavior recognition method based on improved YOLOv11n under low-light image enhancement

刘星宇; 王芳; 任力生

doi:10.11975/j.issn.1002-6819.202504060

Nighttime beef cattle behavior recognition method based on improved YOLOv11n under low-light image enhancement

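Abstract

To address the low visibility, blurred behavioral features, and noise interference in nighttime images of beef cattle, this study proposes a nighttime beef cattle behavior recognition method based on low-light image enhancement. First, the LightenDiffusion model is applied to nighttime beef cattle behavior images to improve their visual quality. Second, because behavioral features are indistinct and hard to recognize at night, the YOLOv11n network is improved in three ways: a C3k2-CAFormerCGLU module is constructed to select locally important features while suppressing noise and irrelevant information; space-to-depth convolution (SPDConv) is adopted to improve the detection of small targets viewed from a distance; and a feature focusing module (FFM) is introduced into the neck network to build a feature focusing diffusion pyramid network (feature focusing pyramid network, FFPN) that enriches the contextual information of features at each scale. The results show that, relative to the overall beef cattle behavior dataset before enhancement, the mean average precision of YOLOv11n after enhancement increased by 2.5 percentage points. CSF-YOLOv11n reached a mean average precision of 94.3% for nighttime beef cattle behavior detection, 4.9 percentage points higher than the baseline YOLOv11n. Visualization results show that CSF-YOLOv11n attends effectively to the key feature regions of different behaviors. The CSF-YOLOv11n model meets the requirements for accurate detection of basic nighttime beef cattle behaviors and can provide technical support for beef cattle health monitoring and intelligent farming.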
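
The space-to-depth step behind SPDConv can be illustrated with a minimal sketch. It assumes a feature map stored as a C x H x W nested list and a downsampling scale of 2; the function name `space_to_depth` and the toy input are hypothetical and are not taken from the paper. In the usual SPD-Conv design this rearrangement is followed by a non-strided convolution, which is omitted here.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange a C x H x W nested list into (C*scale*scale) x H/scale x W/scale.

    No pixel is discarded: each spatial offset (dy, dx) becomes its own
    group of output channels, unlike strided convolution or pooling.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    out = []
    for dy in range(scale):
        for dx in range(scale):
            for c in range(channels):
                # Take every `scale`-th pixel, starting at offset (dy, dx).
                out.append([row[dx::scale] for row in feature_map[c][dy::scale]])
    return out


# Toy 1 x 4 x 4 input whose values are simply 0..15 (illustrative only).
toy = [[[r * 4 + c for c in range(4)] for r in range(4)]]
packed = space_to_depth(toy)
print(len(packed), len(packed[0]), len(packed[0][0]))  # channels, height, width
```

Because the spatial resolution is halved by moving pixels into channels rather than by skipping them, the fine detail of small, distant targets remains available to the following convolution.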

Abstract: Behavior recognition of beef cattle is an essential tool for intelligent breeding and health monitoring in cattle sheds. Recognition performance depends on the breeding environment, including lighting conditions, stocking density, and ground conditions. Beef cattle are herd animals that tend to rest in groups of 3 to 5, so the boundaries between individuals are often blurred, and their behavioral patterns and movements interfere with recognition accuracy. The problem is most severe at night, when image visibility drops sharply and noise further degrades conventional behavior recognition methods. To address the low visibility, blurred and variable behavioral features, and noise interference of nighttime images, this study proposes a nighttime beef cattle behavior recognition framework based on low-light image enhancement. First, the LightenDiffusion model was used to enhance nighttime beef cattle behavior images, improving the clarity and visibility on which target detection relies. Second, the YOLOv11n network was improved by constructing a C3k2-CAFormerCGLU module, whose gating mechanism selects locally important features of beef cattle while suppressing noise and irrelevant information. This reduces misjudgment of individual boundaries and helps distinguish cattle behaviors from background noise and from regions where cattle overlap. Space-to-depth convolution (SPDConv) was used to improve the model's original downsampling, optimizing the flow of spatial and depth information in the convolution operation and giving a more accurate feature representation of distant cattle, which improved behavior recognition from a far viewing angle. Finally, a feature focusing module (FFM) was introduced into the neck network to construct a Feature Focusing Pyramid Network (FFPN). The FFPN strengthens the feature representation of target regions, extracts fine-grained features under different receptive fields, and applies cross-scale feature fusion so that features rich in contextual information propagate effectively between detection scales, which improved multi-scale target recognition in complex environments. Four unsupervised low-light enhancement algorithms (Zero-DCE, Zero-DCE++, EnlightenGAN, and LightenDiffusion) were compared on a low-light image enhancement dataset. LightenDiffusion performed best, with the highest peak signal-to-noise ratio (19.79), the highest structural similarity (91.7%), and the lowest mean squared error (691.46). The beef cattle behavior datasets before and after LightenDiffusion enhancement were each used to train the YOLOv11n detection model, and the mean average precision of behavior recognition increased by 2.5 percentage points after enhancement. The precision, recall, and mean average precision of the improved CSF-YOLOv11n model reached 93.9%, 87.3%, and 94.3%, respectively. Compared with Faster-RCNN, RT-DETR, YOLOv5n, YOLOv7, YOLOv8n, YOLOv9t, YOLOv10n, and YOLOv11n, the mean average precision of the model was higher by 8.4, 7.4, 5.9, 6.4, 5.6, 6.4, 5.7, and 4.9 percentage points, respectively. The findings can provide a strong reference for the healthy breeding of beef cattle under all-weather conditions.
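
The mean squared error and peak signal-to-noise ratio used to rank the enhancement algorithms can be sketched as follows. This is a minimal illustration and not the authors' evaluation code: it assumes 8-bit grayscale images (peak value 255) stored as nested lists, and the function names `mse` and `psnr` and the toy pixel values are hypothetical. The abstract does not state how the metrics were averaged over the dataset.

```python
import math


def mse(reference, enhanced):
    """Mean squared error between two equally sized grayscale images."""
    ref = [p for row in reference for p in row]
    enh = [p for row in enhanced for p in row]
    if len(ref) != len(enh):
        raise ValueError("images must contain the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(ref, enh)) / len(ref)


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit images."""
    error = mse(reference, enhanced)
    if error == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / error)


# Toy 2 x 2 images (illustrative values, not data from the study).
reference = [[100, 110], [120, 130]]
enhanced = [[110, 100], [130, 120]]
toy_mse = mse(reference, enhanced)
toy_psnr = psnr(reference, enhanced)
print(toy_mse, round(toy_psnr, 2))
```

A lower mean squared error and a higher peak signal-to-noise ratio both indicate that the enhanced image is closer to the reference, which is the sense in which the two metrics are read in the comparison above.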
