CHEN Hongyu, YIN Ling, YANG Ming, et al. Multimodal recognition of pig behavior using vision and sensors[J]. Transactions of the Chinese Society of Agricultural Engineering (Transactions of the CSAE), 2025, 41(8): 194-203. DOI: 10.11975/j.issn.1002-6819.202410206

Multimodal recognition of pig behavior using vision and sensors

Pig behavior monitoring was demonstrated to be crucial in precision livestock farming, as behavioral changes not only reflected health status but also influenced production performance, reproductive efficiency, and meat quality. By monitoring pig behavior, breeders could adjust feeding strategies promptly to enhance production efficiency and establish an effective disease early-warning mechanism, thereby ensuring pig welfare and achieving optimal economic benefits. However, continuous manual observation and identification of herd behavior was time-consuming, laborious, and difficult to implement. Consequently, intelligent recognition of pig behaviors emerged as a research focus, primarily employing two approaches: visual technology and sensor technology. Visual recognition methods offered cost-effectiveness and easy deployment, but their accuracy was compromised by illumination variations and occlusions, and their capability for individual identification was limited. Sensor-based techniques required animals to wear sensing nodes, which increased costs, and were less accurate at distinguishing subtle behavioral differences.

To overcome the limitations of single-technology recognition and improve behavioral identification accuracy in complex scenarios, this study proposed a multimodal pig behavior recognition method that integrated visual and sensor technologies and adopted different processing strategies for different scenarios. Three data scenarios were defined: insufficient light, adequate light with severe occlusion, and adequate light without occlusion (including slight occlusion); the three corresponding processing strategies are sketched in the code examples below. For insufficient light, a sensor single-layer classification method was implemented: ear tag acceleration signals were first segmented into multiple time windows and 49 motion features were extracted from each window; a random forest classifier then evaluated feature importance to select key features, and behaviors were recognized using five machine learning models (Baseline, Logit, Random Forest, SVM, and KNN). For adequate light with severe occlusion, a video-sensor dual-layer classification approach was employed: the YOLOv8 model first classified pig posture in the video data, and detailed behavioral classification was then conducted on the ear tag acceleration data according to the posture recognition results. For unobstructed scenarios with sufficient light, a video-based FE-TSM (feature extension-temporal shift module) model was developed, which integrated CBAM (convolutional block attention module) for feature enhancement with a temporal shift network, enabling the model to focus on critical feature regions while preserving essential temporal information that might otherwise be lost through excessive shifting.

The experiment analyzed the video and sensor data of eight Landrace pigs, each recorded for 7 days, using the visual method, the sensor method, and the multimodal method. Pig behaviors were classified into five categories: lying on the side, crouching, half-sitting, eating, and exercising. The results showed that the average accuracy of behavior recognition using sensor data alone was 68.60%, while visual data alone achieved 78.78%; the multimodal method integrating sensor and video data achieved a markedly higher average recognition accuracy of 88.82%. These results indicated that the multimodal approach substantially improved the precision of pig behavior recognition and analysis in complex scenarios. By combining visual and sensor technologies, the method addressed the limitations of the individual modalities, improved the differentiation of the five behaviors, and thereby increased the reliability of pig behavior recognition under challenging environmental conditions.
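As a rough illustration of the sensor single-layer strategy, the following Python sketch windows a tri-axial ear-tag acceleration signal, extracts a handful of motion features per window, ranks them with random-forest feature importance, and compares the five classifiers named in the abstract. The window length, overlap, feature set (the paper extracts 49 features), and placeholder data are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the sensor single-layer classification strategy.
# Window length, step, the specific features, and the placeholder data
# are illustrative assumptions; the paper uses 49 motion features per window.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import train_test_split

def window_signal(acc, win=100, step=50):
    """Split an (N, 3) ear-tag acceleration array into overlapping windows."""
    return [acc[i:i + win] for i in range(0, len(acc) - win + 1, step)]

def extract_features(window):
    """A few illustrative motion features per axis."""
    feats = []
    for axis in range(window.shape[1]):
        x = window[:, axis]
        feats += [x.mean(), x.std(), x.min(), x.max(),
                  np.abs(np.diff(x)).mean()]          # mean absolute delta
    mag = np.linalg.norm(window, axis=1)              # acceleration magnitude
    feats += [mag.mean(), mag.std()]
    return feats

# Placeholder tri-axial acceleration and one behaviour label per window.
acc = np.random.randn(10_000, 3)
windows = window_signal(acc)
X = np.array([extract_features(w) for w in windows])
y = np.random.randint(0, 5, len(X))

# Rank features with random-forest importance and keep the top k.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
top_k = np.argsort(rf.feature_importances_)[::-1][:10]
X_sel = X[:, top_k]

# Compare the five classifiers named in the abstract.
models = {
    "Baseline": DummyClassifier(strategy="most_frequent"),
    "Logit": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=200),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
}
X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, test_size=0.3, random_state=0)
for name, model in models.items():
    print(name, model.fit(X_tr, y_tr).score(X_te, y_te))
```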
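The video-sensor dual-layer strategy can be sketched as a two-stage pipeline: YOLOv8 first assigns a posture label to a frame, and ear-tag acceleration features then refine ambiguous postures into behaviors. The weight file `pig_posture_yolov8.pt`, the posture and behavior class names, the routing rules, and the `standing_clf` classifier are hypothetical placeholders; the abstract only states that detailed behavioral classification uses acceleration data conditioned on the recognized posture.

```python
# Minimal sketch of the video-sensor dual-layer strategy. The weight file,
# class names, and routing rules below are illustrative assumptions.
from ultralytics import YOLO

posture_model = YOLO("pig_posture_yolov8.pt")   # hypothetical trained weights

def classify_posture(frame):
    """First layer: return the posture class of the highest-confidence detection."""
    result = posture_model(frame)[0]
    if len(result.boxes) == 0:
        return None
    best = result.boxes.conf.argmax()
    return result.names[int(result.boxes.cls[best])]

def classify_behavior(posture, acc_features, standing_clf):
    """Second layer: refine a coarse posture using ear-tag acceleration features."""
    if posture in ("lying_side", "crouching", "half_sitting"):
        return posture                      # static postures map directly
    if posture == "standing":
        # e.g. a pre-trained classifier separates eating from exercising
        return standing_clf.predict([acc_features])[0]
    return "unknown"
```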
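For the FE-TSM model, the following PyTorch sketch shows the two building blocks named in the abstract, a CBAM attention block and the temporal shift operation. The shift fraction, reduction ratio, and the way these blocks are wired into the full backbone are assumptions, since the abstract does not describe the architecture in detail.

```python
# Minimal PyTorch sketch of CBAM attention and the temporal shift operation.
# The shift fraction (1/8 of channels each way) and reduction ratio are
# illustrative assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Channel attention followed by spatial attention."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):                                     # x: (N, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx)[:, :, None, None]     # channel attention
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))             # spatial attention

def temporal_shift(x, n_segment, fold_div=8):
    """Shift 1/fold_div of channels forward and 1/fold_div backward in time."""
    nt, c, h, w = x.shape
    x = x.view(nt // n_segment, n_segment, c, h, w)
    fold = c // fold_div
    out = torch.zeros_like(x)
    out[:, :-1, :fold] = x[:, 1:, :fold]                      # shift toward the past
    out[:, 1:, fold:2 * fold] = x[:, :-1, fold:2 * fold]      # shift toward the future
    out[:, :, 2 * fold:] = x[:, :, 2 * fold:]                 # untouched channels
    return out.view(nt, c, h, w)

# Example: 2 clips of 8 frames each, 64-channel feature maps.
feat = torch.randn(2 * 8, 64, 14, 14)
feat = temporal_shift(feat, n_segment=8)
feat = CBAM(64)(feat)
print(feat.shape)                                             # torch.Size([16, 64, 14, 14])
```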