Abstract
Pig counting is a crucial task in modern pig farming: it underpins the assessment of farming scale, the optimization of feeding strategies, and improvements in management efficiency and economic returns. In real farming environments, however, accurate counting is difficult because of high pig density, severe occlusion between individuals, and complex lighting conditions. To address these difficulties, this paper proposes PIG-P2PNet, an improved pig counting model based on the crowd counting model P2PNet, with the aim of improving adaptability and counting accuracy in real-world farming scenarios. First, a channel attention mechanism is introduced into the backbone network so that the model captures dependencies among channels more effectively, which helps it recognize overlapping pigs in dense scenes where individual animals are partially occluded. Second, a coordinate channel shuffling attention module is integrated with the feature pyramid to strengthen the extraction and interaction of spatial location information and channel features, enabling the model to handle scenes of varying density. Third, a context-aware Hungarian matching algorithm is designed that incorporates weighted distance penalties, uncertainty costs, and adaptive density penalties. These terms allow the matching to adapt to the distribution of targets in different regions and thereby improve the assignment between ground-truth and predicted points. As a result, the model significantly reduces mismatches in densely populated areas and counts pigs more accurately in challenging high-density scenarios.
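The matching step can be illustrated with a small sketch. The abstract names the ingredients of the context-aware cost (a weighted distance term, an uncertainty term, and a density-adaptive term) but not their formulas, so everything below is an assumption made for illustration: the function names, the weights `w_dist` and `w_unc`, the use of `1 - confidence` as the uncertainty cost, and the division of each distance by the nearest-neighbour spacing of the ground-truth points as the density adaptation. For brevity, an exhaustive search over assignments stands in for the Hungarian algorithm; it finds the same minimum-cost one-to-one assignment on tiny inputs, whereas a practical implementation would use a polynomial-time solver.

```python
import itertools
import math


def local_spacing(gt, j):
    """Distance from ground-truth point j to its nearest ground-truth neighbour."""
    others = [math.dist(gt[j], g) for k, g in enumerate(gt) if k != j]
    return min(others) if others else 1.0


def match_cost(pred, conf, gt, w_dist=1.0, w_unc=1.0):
    """Hypothetical cost matrix: density-normalised distance plus uncertainty.

    cost[i][j] is the cost of assigning predicted point i to ground-truth
    point j. Dividing by the local spacing makes the same pixel offset more
    expensive where pigs are tightly packed (an assumed form of the
    density-adaptive penalty, not the paper's formula).
    """
    cost = []
    for p, c in zip(pred, conf):
        row = []
        for j, g in enumerate(gt):
            dist = math.dist(p, g) / local_spacing(gt, j)
            row.append(w_dist * dist + w_unc * (1.0 - c))
        cost.append(row)
    return cost


def best_assignment(cost, n_gt):
    """Exhaustive minimum-cost one-to-one assignment (stand-in for Hungarian).

    Returns a list whose j-th entry is the index of the prediction matched
    to ground-truth point j. Only feasible for very small inputs.
    """
    n_pred = len(cost)
    best = min(
        itertools.permutations(range(n_pred), n_gt),
        key=lambda perm: sum(cost[i][j] for j, i in enumerate(perm)),
    )
    return list(best)


# Toy scene: two pigs close together, one far away, and one spurious proposal.
gt = [(10.0, 10.0), (12.0, 10.0), (40.0, 40.0)]
pred = [(10.5, 10.0), (11.6, 10.2), (39.0, 41.0), (25.0, 25.0)]
conf = [0.9, 0.8, 0.7, 0.95]

cost = match_cost(pred, conf, gt)
matches = best_assignment(cost, len(gt))
print(matches)
```

In this toy scene the high-confidence but distant fourth proposal is left unmatched, which is the behaviour a combined distance and uncertainty cost is meant to produce.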
Furthermore, to address the imbalance between background and target samples, Focal Loss replaces the cross-entropy loss of the original model, improving classification accuracy by focusing training on hard-to-classify samples. PIG-P2PNet was evaluated on a self-built dataset covering a variety of scenes, camera perspectives, and pig density levels. It performed best across multiple metrics, with a mean absolute error (MAE), root mean square error (RMSE), and normalized absolute error (NAE) of 0.873, 1.502, and 0.040, respectively. Relative to the original P2PNet, these values represent reductions of 33.9%, 22.1%, and 39.4%, respectively. The model also generalized well, counting pigs accurately across varying scenarios. Compared with the classic counting models CSRNet, CANNet, and CLTR, PIG-P2PNet reduced the MAE by 63.3%, 54.5%, and 26.7%, the RMSE by 49.7%, 47.1%, and 13.7%, and the NAE by 73.5%, 56.5%, and 35.5%, respectively. With this accuracy, PIG-P2PNet can be expected to provide reliable counts under the high density and occlusion typical of real farming conditions. In summary, PIG-P2PNet shows practical potential for the livestock industry, particularly in environments where traditional counting methods fail. By combining point annotation with point regression, it handles counting in dense pig populations efficiently. Future work could examine the model's adaptability on larger datasets and in combination with multi-object tracking to support decision-making in the livestock industry.
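Two quantities mentioned above can be made concrete with a minimal sketch: the focal loss used for classification and the three counting errors used for evaluation. The focal loss follows the standard binary form, -alpha_t * (1 - p_t)^gamma * log(p_t); the defaults `gamma=2.0` and `alpha=0.25` are the values commonly used with this loss and are an assumption here, since the abstract does not state the settings used for PIG-P2PNet. The NAE is assumed to be the per-image absolute error divided by the true count, averaged over images, which is the usual definition in counting work. Function names and the example counts are illustrative only.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss for one proposal.

    p is the predicted probability that the proposal is a pig (foreground),
    y is 1 for a matched (positive) proposal and 0 for background.
    gamma and alpha are assumed defaults, not values from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)          # avoid log(0)
    p_t = p if y == 1 else 1.0 - p           # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def counting_errors(pred_counts, true_counts):
    """Return (MAE, RMSE, NAE) over a set of images.

    NAE divides each absolute error by the true count, so every true
    count must be positive (assumed definition).
    """
    n = len(true_counts)
    abs_err = [abs(p - t) for p, t in zip(pred_counts, true_counts)]
    mae = sum(abs_err) / n
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)
    nae = sum(e / t for e, t in zip(abs_err, true_counts)) / n
    return mae, rmse, nae


# A confidently correct proposal contributes far less loss than a missed one.
easy = focal_loss(0.9, 1)
hard = focal_loss(0.1, 1)

# Made-up counts for three images, for illustration only.
mae, rmse, nae = counting_errors([10, 21, 29], [10, 20, 32])
print(easy, hard, mae, rmse, nae)
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half the ordinary cross-entropy, which shows that the `(1 - p_t)^gamma` factor is what shifts weight toward hard samples.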