
Recognizing pig behavior using feature point detection

  • Abstract: Pig behavior is closely related to pig health, and the body feature information produced during a pig's daily activities reflects its behavioral state. Because pig posture varies greatly, existing pig feature extraction methods are complex and inefficient, which degrades recognition performance. To address this, this study built a pig feature point detection model, YOLO-ASF-P2, to extract the time-series information of feature points at key body parts, and then used that temporal information to build a pig behavior recognition model, CNN-BiGRU, that recognizes three behaviors: sitting, standing, and lying. YOLO-ASF-P2 takes YOLOv8s-Pose as its baseline and exploits the high-resolution feature map of the P2 layer in the backbone to mine more target features, while adopting the ASF architecture to improve the feature fusion stage. First, the scale sequence feature fusion module (SSFF) aligns feature maps of different scales, improving the model's ability to fuse multi-scale feature information. Second, the triple feature encoding module (TFE) balances the details of multi-scale feature information so that the model does not lose target features. Finally, the channel and position attention mechanism (CPAM) captures the spatial information of the feature points, enabling accurate detection of the pig's feature points. CNN-BiGRU uses bidirectional gated recurrent units and an attention mechanism to flexibly capture and weight the time-series information of the pig feature points, effectively combining their temporal features to recognize pig behavior efficiently. In experiments, YOLO-ASF-P2 achieved a detection precision of 92.5%, a recall of 90.0%, and an average precision (AP50-95) of 68.2%, with 39.6 GFLOPs and 18.4 M parameters. CNN-BiGRU achieved an average recognition precision of 95.8% for the three behaviors, with 27.1 GFLOPs and 0.151 M parameters. In summary, the proposed feature point detection model is accurate and lightweight, effectively coping with the challenge that variable pig posture poses to accurate feature point detection, and the behavior recognition model, combined with the temporal information of the feature points, effectively recognizes sitting, standing, and lying, providing a new approach to pig behavior recognition.
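As a concrete illustration of the second stage, the following PyTorch sketch shows one plausible CNN-BiGRU layout consistent with the description above: a 1-D convolution over per-frame keypoint coordinates, a bidirectional GRU, attention-weighted pooling over time, and a three-class head for sitting, standing, and lying. The keypoint count, layer widths, kernel size, and clip length are illustrative assumptions, not the paper's exact configuration.

    import torch
    import torch.nn as nn

    class AttentionPool(nn.Module):
        """Scores each time step and returns the attention-weighted sum."""
        def __init__(self, dim):
            super().__init__()
            self.score = nn.Linear(dim, 1)

        def forward(self, h):                         # h: (batch, time, dim)
            w = torch.softmax(self.score(h), dim=1)   # weights over time steps
            return (w * h).sum(dim=1)                 # (batch, dim)

    class CNNBiGRU(nn.Module):
        """Keypoint sequences -> Conv1d -> BiGRU -> attention -> 3 behaviors."""
        def __init__(self, n_keypoints=8, hidden=64, n_classes=3):
            super().__init__()
            in_ch = n_keypoints * 2                   # (x, y) per keypoint per frame
            self.cnn = nn.Sequential(
                nn.Conv1d(in_ch, 32, kernel_size=3, padding=1),  # local motion features
                nn.ReLU(),
            )
            self.bigru = nn.GRU(32, hidden, batch_first=True, bidirectional=True)
            self.attn = AttentionPool(2 * hidden)
            self.head = nn.Linear(2 * hidden, n_classes)         # sit / stand / lie

        def forward(self, x):                         # x: (batch, time, n_keypoints*2)
            f = self.cnn(x.transpose(1, 2)).transpose(1, 2)
            h, _ = self.bigru(f)                      # forward + backward context
            return self.head(self.attn(h))            # behavior logits

    # e.g. a batch of 4 clips, 30 frames each, 8 keypoints per frame
    logits = CNNBiGRU()(torch.randn(4, 30, 16))       # -> (4, 3)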

     

    Abstract: Pig farming has been shifting toward intensive and intelligent production in recent years, driven by advances in artificial intelligence, deep learning, and automation. Machine vision and deep learning can be combined to realize non-invasive individual identification and behavior monitoring, for which the characteristic information that pigs generate during daily activities is crucial. However, because pig posture changes frequently, existing pig feature extraction methods are complex and inefficient, which limits behavior recognition. In this study, a feature point detection model (YOLO-ASF-P2) was proposed to extract feature points from key regions of the pig's body, and a behavior recognition model (CNN-BiGRU) was introduced to exploit the temporal information of those feature points. First, video and image data of pigs were collected by multi-angle cameras deployed in the pig house, and two datasets were built, one for feature point detection and one for behavior recognition. Traditional pig feature extraction often involves complex calculations and redundant feature information and yields low model robustness. Therefore, the original YOLOv8s-Pose model was improved into YOLO-ASF-P2: the feature information of the P2 detection layer was exploited for small targets, and the attention scale sequence fusion (ASF) architecture was adopted to focus on the key feature points of live pigs. The scale sequence feature fusion (SSFF) module uses a Gaussian kernel and nearest-neighbor interpolation to align multi-scale feature maps with different downsampling rates (the P2, P3, P4, and P5 detection layers) to the resolution of the high-resolution feature map, as illustrated in the sketch below. The triple feature encoding (TFE) module captures the local fine details of small targets and then fuses local and global feature information. The channel and position attention mechanism module (CPAM) captures and refines the spatial positioning information of small targets, effectively extracting the important features from the feature maps of different channels and improving the positioning accuracy of the model.
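The following minimal sketch illustrates that alignment step. It assumes the pyramid levels have already been projected to a common channel width (handled inside the real network): each map is Gaussian-smoothed, resized with nearest-neighbor interpolation to the P2 resolution, and the aligned maps are stacked into a scale sequence for subsequent 3-D fusion. The kernel size, sigma, channel width, and map sizes are assumptions for the example.

    import torch
    import torch.nn.functional as F
    from torchvision.transforms.functional import gaussian_blur

    def ssff_align(p_levels):
        """Align pyramid maps (P2..P5, finest first) to the P2 resolution."""
        target_hw = p_levels[0].shape[-2:]            # P2: highest resolution
        aligned = []
        for p in p_levels:
            p = gaussian_blur(p, kernel_size=5, sigma=1.0)        # smooth first
            p = F.interpolate(p, size=target_hw, mode="nearest")  # NN upsampling
            aligned.append(p)
        return torch.stack(aligned, dim=2)            # (batch, C, n_scales, H, W)

    # e.g. P2..P5 maps sharing 64 channels (an assumption for the sketch)
    feats = [torch.randn(1, 64, s, s) for s in (160, 80, 40, 20)]
    scale_seq = ssff_align(feats)                     # ready for Conv3d fusion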
The CNN-BiGRU model was used to recognize pig behavior: bidirectional gated recurrent unit (BiGRU) layers capture the forward and backward information of the sequence data, and their output is weighted by an attention module (AttentionBlock). The model achieved excellent and stable performance on the self-built dataset, with an average recognition accuracy of 95.8% for the three behaviors of sitting, standing, and lying. For feature point detection, YOLO-ASF-P2 reached a precision of 92.5%, a recall of 90%, and an average precision (AP50-95) of 68.2%, with only 18.4 M parameters and 39.6 GFLOPs. Precision, recall, and AP50-95 were 1.1%, 2.3%, and 1.5% higher than those of the original model, respectively, and FLOPs were 32.9% higher, while the number of parameters was reduced by 17.5%. Compared with MMPose, YOLO-ASF-P2 improved AP50-95 and precision by 17.4% and 2.9%, respectively, while maintaining almost the same recall. Compared with RTMPose, it improved precision, recall, and AP50-95 with fewer parameters. Its precision was similar to that of YOLOv5s-Pose and slightly lower than that of YOLOv7s-Pose, while its recall and AP50-95 exceeded those of both. Overall, the lightweight model achieved better performance in detecting pig feature points. The CNN-BiGRU behavior recognition model likewise combined high average recognition accuracy with stable performance, requiring only 0.151 M parameters and 27.1 GFLOPs. In summary, the combination of YOLO-ASF-P2 and CNN-BiGRU significantly improves the accuracy and robustness of pig feature point detection and behavior recognition, and offers a valuable tool for the intensive and intelligent development of pig farming.

     
