Detecting piglet crushing events using amodal instance segmentation

薛月菊; 孙奥深; 杨玉清; 江天; 罗霞; Annalisa Scollo; Tomas Norton; 甘海明

doi:10.11975/j.issn.1002-6819.202412213

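Abstract: Crushing of piglets by lactating sows is one of the main causes of pre-weaning piglet mortality. Occlusion of piglets by the sow poses a considerable challenge for computer-vision-based detection of crushing events. To address this problem, this study proposes a piglet crushing event detection method based on amodal instance segmentation (AIS). First, the BCNet-FF (bilayer convolutional network-focused fusion) model is proposed to achieve high-accuracy sow posture detection and occlusion-robust amodal instance segmentation of pigs. The model introduces the proposed FLA-DSC (focused linear attention-depthwise separable convolution) attention module into the backbone network and adds the FreqFusion feature fusion module to the feature pyramid network to improve segmentation accuracy for occluded piglets. Second, the dorsal region of a sow in the lateral lying posture is identified, and a regression equation is constructed to estimate the contour line along which the sow actually contacts the floor, thereby determining the risk region in which the sow may crush piglets. Third, a multi-object tracking method based on the mask IoU (intersection over union) of targets in adjacent frames is used to track piglets within the risk region. Finally, the overlap ratio between each piglet and the risk region is calculated to determine whether a crushing event has occurred. Instance segmentation experiments showed that BCNet-FF achieved an mAP@50 of 98.5%, with an AP@50 of 97.4% for piglet segmentation. Compared with BCNet (bilayer convolutional network), BCNet-FF improved the mAP@50 and the piglet AP@50 by 2.2 and 3.1 percentage points, respectively. Experiments on 90 test videos showed that the piglet multi-object tracking method achieved an average IDF1 score (identification F1 score) of 94.5% and a multiple object tracking accuracy (MOTA) of 93.8%. The proposed method detected piglet crushing events with an accuracy of 91.1%, a sensitivity of 90.6%, and a specificity of 91.9%, at an overall inference speed of 8.9 frames/s. It can accurately detect events in which a laterally lying sow crushes piglets and provides a technical reference for the timely discovery of crushed piglets.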
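
The tracking step described above associates piglet masks across adjacent frames by their mask IoU. The abstract does not give the association rule, so the following is only a minimal sketch under stated assumptions: masks are represented as sets of (row, col) pixels, matching is greedy by descending IoU, and the `iou_threshold` of 0.5 is a hypothetical value. The names `mask_iou` and `match_tracks` are illustrative and do not come from the paper.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]  # a mask as the set of (row, col) pixels it covers


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel-set masks."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def match_tracks(prev: Dict[int, Mask], curr: List[Mask],
                 iou_threshold: float = 0.5) -> Dict[int, int]:
    """Assign each current-frame mask a track ID.

    prev maps track ID -> mask in the previous frame.
    Returns {index into curr: track ID}. Masks that match no previous
    track receive fresh IDs. Greedy matching and the 0.5 threshold are
    assumptions made for illustration, not the paper's settings.
    """
    # All (IoU, track ID, current index) candidates, best IoU first.
    pairs = sorted(
        ((mask_iou(m_prev, m_curr), tid, j)
         for tid, m_prev in prev.items()
         for j, m_curr in enumerate(curr)),
        reverse=True,
    )
    assigned: Dict[int, int] = {}
    used_ids: Set[int] = set()
    for iou, tid, j in pairs:
        if iou < iou_threshold:
            break  # remaining candidates overlap too little
        if j in assigned or tid in used_ids:
            continue  # each mask and each track is used at most once
        assigned[j] = tid
        used_ids.add(tid)
    # Start new tracks for unmatched masks.
    next_id = max(prev, default=-1) + 1
    for j in range(len(curr)):
        if j not in assigned:
            assigned[j] = next_id
            next_id += 1
    return assigned
```

Greedy matching is used here only because it is short. An optimal assignment (for example, the Hungarian algorithm) would be a drop-in alternative for the same IoU scores.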

Abstract: Crushing by sows is one of the primary causes of piglet mortality before weaning. Occlusion of piglets by the sow makes crushing events highly challenging to detect with existing computer vision methods. In this study, a method for accurate and rapid detection of piglet crushing events was proposed using amodal instance segmentation (AIS). Firstly, an improved AIS model, BCNet-FF (bilayer convolutional network-focused fusion), was developed by enhancing the baseline bilayer convolutional network (BCNet). The FLA-DSC (focused linear attention-depthwise separable convolution) module, an attention module built on focused linear attention, was integrated into the backbone network to improve feature extraction for occluded objects. Additionally, the FreqFusion feature fusion module was introduced into the feature pyramid network (FPN), which strengthened multi-scale feature representation and improved feature extraction and fusion for the contours of occluded objects. Consequently, high-precision segmentation was achieved. Secondly, a custom dataset was constructed, and the video sequences were processed using BCNet-FF to extract sow posture information and pig instance segmentation results. When the sow's posture was identified as lateral lying, the dorsal region of the sow was detected, and a regression equation was formulated to estimate the actual contact contour between the sow and the ground. According to this contour, a risk region was delineated to mark the area with a high likelihood of piglet crushing. Thirdly, a multi-object tracking algorithm based on the target mask intersection over union (IoU) between adjacent frames was applied to track the piglets within the risk region. Finally, the overlap ratio between each piglet and the risk region was calculated to determine whether a crushing event had occurred.
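
The final decision step, the overlap ratio between a piglet and the risk region, can be illustrated as follows. The abstract does not define the ratio or the decision threshold, so this sketch assumes the ratio is the fraction of the piglet's (amodal) mask area that falls inside the risk region, and it uses a hypothetical `threshold` of 0.3. The function names are illustrative only.

```python
from typing import Set, Tuple

Mask = Set[Tuple[int, int]]  # set of (row, col) pixels


def risk_overlap_ratio(piglet: Mask, risk_region: Mask) -> float:
    """Fraction of the piglet mask that lies inside the risk region.

    Assumed definition: |piglet ∩ risk_region| / |piglet|.
    """
    if not piglet:
        return 0.0
    return len(piglet & risk_region) / len(piglet)


def is_crushing_candidate(piglet: Mask, risk_region: Mask,
                          threshold: float = 0.3) -> bool:
    """Flag a piglet whose overlap ratio reaches the threshold.

    The threshold value is a placeholder, not taken from the paper.
    """
    return risk_overlap_ratio(piglet, risk_region) >= threshold
```

Normalizing by the piglet's own area, rather than by the union as in IoU, keeps the ratio independent of how large the risk region is. Whether the paper normalizes this way is not stated in the abstract.
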
The results showed that BCNet-FF achieved precision rates of 97.8%, 98.2%, and 99.5% for the standing, sternal lying, and lateral lying postures, respectively, with recall rates of 98.2%, 97.2%, and 97.8%, respectively. Segmentation ablation experiments showed that, compared with BCNet, BCNet-FF improved the mAP@50 by 2.2 percentage points, the mIoU by 2.5 percentage points, and the AP@50 for piglets by 3.1 percentage points. The model size increased by 47.5 MB, and the inference speed was 11.4 frames per second. In the comparative experiments, BCNet-FF demonstrated superior performance in instance segmentation of sows and piglets, achieving an AP@50 of 97.4% in piglet segmentation and exceeding Mask R-CNN, CenterMask2, Transfiner, YOLOv11x-seg, and BCNet by 7.8, 4.6, 5.6, 2.3, and 3.1 percentage points, respectively. In the evaluation of 90 video samples, the multi-object tracking method for piglets achieved an average IDF1 of 94.5% and a MOTA of 93.8%. The proposed method achieved an accuracy of 91.1% in detecting piglet crushing events, with a sensitivity of 90.6%, a specificity of 91.9%, and an average inference speed of 8.9 frames per second. These results indicate that BCNet-FF segments sows and piglets well in occlusion scenarios, particularly the masks of occluded piglets; that multi-object tracking based on the target mask IoU can track occluded piglets accurately; and that the constructed regression equation and risk region distinguish occluded piglets from crushed piglets, which improved the accuracy of crushing event detection. The method is therefore practical and reliable for detecting piglet crushing events, and the findings provide an effective computer vision approach for monitoring them.
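
For readers less familiar with the three event-level metrics reported above, the sketch below shows their standard definitions in terms of confusion-matrix counts. The function name and the example counts in the usage note are hypothetical. They are not the paper's data.

```python
from typing import Dict


def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity from confusion-matrix counts.

    tp: crushing events correctly detected
    fn: crushing events missed
    tn: non-crushing samples correctly rejected
    fp: non-crushing samples wrongly flagged
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one positive and one negative sample")
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }
```

As an illustration with made-up counts, `detection_metrics(tp=8, fp=1, tn=9, fn=2)` gives an accuracy of 0.85, a sensitivity of 0.8, and a specificity of 0.9.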
