Abstract:
Precise monitoring of perinatal behaviors in ewes is crucial for improving reproductive efficiency and reducing the risk of dystocia in intelligent livestock systems. However, recognizing perinatal behaviors from video data remains a major challenge because of their long temporal duration, subtle motion patterns, and high inter-class similarity among behavioral features. In this study, a robust and efficient multi-modal framework for skeleton-based behavior recognition was developed on the basis of an improved Two-Stream Adaptive Graph Convolutional Network (2s-AGCN). The graph deep learning model was designed to capture both spatial and temporal dependencies in complex ewe behaviors. (1) A Weighted Adaptive Graph Convolution Layer (W-AGCL) was introduced to overcome the limited adaptability of a fixed skeletal topology when representing subtle or dynamic motion. The connection strengths between skeletal nodes were dynamically adjusted according to the behavioral context, so that the most informative spatial relationships were learned adaptively. This adaptive weighting mechanism enhanced the model's sensitivity to variations in spatial structure and its robustness to noise and individual differences among ewes. (2) A Spatial Temporal Enhanced Attention Module (STE) was developed to handle the uneven temporal distribution and the presence of micro-movements in perinatal behaviors. The module selectively emphasized spatiotemporal regions by assigning higher attention weights to frames and joints carrying discriminative information, thereby improving the network's ability to capture subtle but behaviorally significant cues. Furthermore, a four-stream graph convolutional architecture was proposed for multi-modal feature fusion.
Four complementary modalities were processed simultaneously: the Joint Stream representing skeletal joint coordinates, the Bone Stream capturing limb connectivity and orientation, the Joint Motion Stream describing the temporal displacement of joints, and the Bone Motion Stream modeling dynamic variations in bone vectors. Feature learning and deep interaction across these streams were integrated to characterize both static posture configurations and dynamic motion. A video dataset of ewe perinatal behavior was constructed, including annotated samples of typical behaviors such as standing, lying, turning, nest-building, and lambing. Experiments were then conducted to verify the model. The results show that the improved 2s-AGCN model achieved a Top-1 classification accuracy of 86.21%, outperforming several skeleton-based action recognition models, including ST-GCN, ST-GCN++, PoseC3D, Shift-GCN, CTR-GCN, and the original 2s-AGCN. Specifically, its Top-1 accuracy was 7.81%, 8.41%, 7.95%, 7.26%, 7.11%, and 6.89% higher than those of these models, respectively. This performance was achieved with a compact architecture of 5.70 million parameters and an average inference latency of 15.5 ms per frame, which supports real-time monitoring in farm environments. The results indicate that skeleton-based graph convolutional models can recognize fine-grained animal behaviors during the perinatal period. The improved 2s-AGCN framework effectively balanced accuracy, efficiency, and real-time inference by adaptively learning spatiotemporal dependencies and integrating multi-modal skeletal information. The findings can provide a powerful tool for automatic behavior recognition in smart sheep farming. Deep graph learning can be expected to support livestock monitoring in future applications, such as early lambing prediction, automated reproductive management, and precision welfare assessment in smart pastures.
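
As an illustration of the four input modalities named above, the following minimal sketch derives bone, joint-motion, and bone-motion arrays from joint coordinates in the way commonly used for multi-stream skeleton models. It is not the authors' implementation: the five-joint skeleton in `BONE_PAIRS`, the function name `build_streams`, and the `(T, V, C)` array layout are assumptions made for illustration, since the abstract does not specify the ewe keypoint layout or how the streams are fused.

```python
import numpy as np

# Hypothetical 5-joint skeleton given as (child, parent) pairs.
# Joint 0 is treated as the root and paired with itself.
# The actual ewe keypoint layout is not stated in the abstract.
BONE_PAIRS = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 0)]


def temporal_difference(x: np.ndarray) -> np.ndarray:
    """Frame-to-frame displacement; the last frame is zero-padded."""
    motion = np.zeros_like(x)
    motion[:-1] = x[1:] - x[:-1]
    return motion


def build_streams(joints: np.ndarray, bone_pairs=BONE_PAIRS) -> dict:
    """Derive four input streams from joint coordinates.

    joints: float array of shape (T, V, C) = (frames, joints, coordinates).
    Returns a dict of four arrays, each with the same shape as `joints`.
    """
    # Bone stream: vector from each parent joint to its child joint.
    bones = np.zeros_like(joints)
    for child, parent in bone_pairs:
        bones[:, child] = joints[:, child] - joints[:, parent]

    return {
        "joint": joints,
        "bone": bones,
        "joint_motion": temporal_difference(joints),
        "bone_motion": temporal_difference(bones),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clip = rng.normal(size=(16, 5, 2))  # 16 frames, 5 joints, 2-D coordinates
    streams = build_streams(clip)
    for name, data in streams.items():
        print(name, data.shape)
```

Each stream keeps the input shape, so the same network backbone could in principle be applied to every stream before the per-stream outputs are combined; the specific fusion used in this work is not described in the abstract.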