Abstract:
Pig lameness is one of the most important health indicators for production efficiency and animal welfare in modern pig farming. Existing lameness detection methods mostly rely on constrained walkway systems with a single-view camera, which are highly sensitive to occlusion and self-occlusion. Walkway-based acquisition also requires special facilities and can induce stress-related, unnatural locomotion, which limits the reliability and representativeness of the gait data. In this study, an end-to-end multi-view 3D pose estimation network that requires no 3D annotations, termed the Multi-View Voxel Network (MV-VoxelNet), was proposed for natural, non-contact, and high-precision pig lameness detection under free-moving conditions, and a lameness detection method was developed on the reconstructed 3D skeletons. MV-VoxelNet extracted accurate 3D pig keypoints from synchronized multi-view RGB images and reconstructed the 3D skeleton through a coarse-to-fine process, in which global localization was followed by local refinement. In the global localization stage, the multi-view observations were fused into a voxel grid to construct a coarse 3D representation of the pig. A voxel cross-view attention (VCVA) module was introduced to adaptively fuse the features from the different camera views, so that the multi-view information was integrated effectively and robustness under occlusion was improved. The fused voxel features were then processed by a 3D voxel-to-voxel network to obtain an initial 3D skeleton. In the local refinement stage, a lightweight refinement voxel-to-voxel network (RefineV2V) was applied to each keypoint within a small voxel region centered on its initial prediction, which gave more precise localization at a low computational cost. Continuous 3D keypoint coordinate sequences were then obtained from the 3D skeletons reconstructed over consecutive video frames.
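The local refinement step can be illustrated with a minimal NumPy sketch. It only shows how a small voxel region might be centered on an initial keypoint estimate and how a refined coordinate could be read out from per-voxel scores. The function names, the cube edge length, the grid resolution, and the soft-argmax readout are all assumptions made for illustration; the abstract does not state how RefineV2V converts voxel scores into coordinates, and the synthetic scores below merely stand in for a network output.

```python
import numpy as np

def local_voxel_grid(center, side=0.2, n=16):
    """Return the (n, n, n, 3) voxel centers of a cube around `center`.

    `side` (cube edge length, in the units of `center`) and `n` (voxels per
    edge) are hypothetical values chosen only for this sketch.
    """
    center = np.asarray(center, dtype=float)
    # Centers of n equal cells spanning [-side/2, side/2] along each axis.
    offsets = (np.arange(n) + 0.5) / n * side - side / 2
    gx, gy, gz = np.meshgrid(offsets, offsets, offsets, indexing="ij")
    return center + np.stack([gx, gy, gz], axis=-1)

def soft_argmax(grid, scores):
    """Softmax-weighted mean of the voxel centers (assumed readout)."""
    weights = np.exp(scores - scores.max())  # subtract max for stability
    weights /= weights.sum()
    return (grid * weights[..., None]).sum(axis=(0, 1, 2))

# Synthetic example: the initial estimate is slightly off the true keypoint.
initial = np.array([1.0, 2.0, 3.0])
true_kp = initial + np.array([0.03, -0.02, 0.01])
grid = local_voxel_grid(initial)
# Stand-in for network scores: a Gaussian bump around the true keypoint.
scores = -((grid - true_kp) ** 2).sum(axis=-1) / (2 * 0.015 ** 2)
refined = soft_argmax(grid, scores)
```

Because the grid covers only a small neighborhood of each keypoint, a fine voxel size can be used without the memory cost of refining the whole capture volume, which is consistent with the low computational cost described above.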
A dynamic local coordinate system was constructed to normalize the gait representations, in order to handle free movement and varying walking directions. Specifically, the four hoof contact points were used to estimate an optimal ground support plane via singular value decomposition, and the plane normal was defined as the vertical axis. The forward body direction was determined from the orientation of the trunk keypoints together with the support plane, so that a stable local coordinate system was established for each frame. All 3D keypoints were then transformed into this coordinate system, which eliminated the influence of the global orientation and allowed consistent gait analysis under free-walking conditions. Furthermore, six gait features were designed in this dynamic local coordinate system to characterize pig locomotion. These features captured complementary aspects of gait abnormality, including hoof symmetry, trunk vertical stability, lateral trunk inclination, gait phase imbalance, swing initiation and termination consistency, and fore–hind hoof following behavior. The extracted features were subsequently fed into a support vector machine (SVM) classifier to classify pigs as normal, mildly lame, or severely lame. Results showed that MV-VoxelNet achieved a reprojection error (RE) of 10.17 pixels and a percentage of correct keypoints at a normalized threshold of 0.05 (PCK@0.05) of 88.83% without any 3D ground-truth annotations, with PCK@0.05 improved by 13.59 percentage points over the baseline. Moreover, lameness detection achieved an overall classification accuracy of 88.51%, indicating the effectiveness of the framework. Overall, a unified pipeline was presented that reconstructs 3D pig skeletons from synchronized multi-view RGB images, extracts gait features in a dynamic local coordinate system, and classifies lameness severity with a machine learning classifier.
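The construction of the dynamic local coordinate system can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of two trunk keypoints (rear and front) to define the heading, the choice of the hoof centroid as origin, and the rule for orienting the normal toward the trunk are all hypothetical.

```python
import numpy as np

def local_frame(hooves, trunk_rear, trunk_front):
    """Build a per-frame local coordinate system from 3D keypoints.

    hooves: (4, 3) hoof contact points (assumed input layout).
    trunk_rear, trunk_front: (3,) trunk keypoints (assumed heading pair).
    Returns (origin, R); the rows of R are the forward, lateral, and
    vertical axes expressed in world coordinates.
    """
    hooves = np.asarray(hooves, dtype=float)
    trunk_rear = np.asarray(trunk_rear, dtype=float)
    trunk_front = np.asarray(trunk_front, dtype=float)
    origin = hooves.mean(axis=0)
    # Least-squares plane fit: the right singular vector belonging to the
    # smallest singular value of the centered points is the plane normal.
    _, _, vt = np.linalg.svd(hooves - origin)
    z = vt[-1]
    # Assumption: the trunk lies above the support plane, which fixes the sign.
    if np.dot(z, 0.5 * (trunk_rear + trunk_front) - origin) < 0:
        z = -z
    # Forward axis: trunk direction projected onto the support plane.
    heading = trunk_front - trunk_rear
    x = heading - np.dot(heading, z) * z
    x /= np.linalg.norm(x)
    y = np.cross(z, x)  # lateral axis completes a right-handed frame
    return origin, np.stack([x, y, z])

def to_local(points, origin, R):
    """Express world-frame points in the local frame."""
    return (np.asarray(points, dtype=float) - origin) @ R.T

# Same stance, two different walking directions in the world frame.
hooves = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0]]
o1, R1 = local_frame(hooves, [0, 0.5, 0.5], [1, 0.5, 0.5])  # walking along +x
o2, R2 = local_frame(hooves, [0.5, 0, 0.5], [0.5, 1, 0.5])  # walking along +y
front1 = to_local([1, 0.5, 0.5], o1, R1)
front2 = to_local([0.5, 1, 0.5], o2, R2)
```

In this toy example the front trunk keypoint has the same local coordinates for both walking directions, which is the orientation invariance that the normalization is meant to provide.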
The framework required no 3D annotations and only a few standard RGB cameras, which gives it a low annotation cost and makes it easy to deploy. Accurate, contact-free monitoring of pig gait can therefore be expected on real-world pig farms. These findings can also provide a scalable solution for 3D pig pose modeling and lameness detection.