Abstract:
Precise identification of individual cows is a fundamental prerequisite for downstream tasks in smart animal husbandry, including precision feeding, health monitoring, and accurate estrus detection. Conventional contact-based identification methods, such as electronic ear tags and sensor collars, have been widely adopted in recent years. Nevertheless, their application is limited by high maintenance costs, susceptibility to damage, and the stress responses they induce, which compromise animal welfare. Non-contact computer vision has therefore emerged as a promising alternative. However, mainstream vision models face significant challenges in real-world breeding environments, such as variable illumination in barns, diverse cow postures during movement, and the fine-grained nature of cow facial features; they must also balance high recognition accuracy against a lightweight architecture. In this study, a lightweight cow face identification model, LCFI-Net (Lightweight Cow Face Identification Network), was proposed based on an improved DenseNet121 architecture. A three-stage structural optimization was carried out to balance model performance and computational efficiency. First, the backbone of the standard DenseNet121 was structurally pruned and streamlined into a DenseNet_Lite module, removing redundant parameters while retaining the essential feature extraction capability. Second, a Multi-Scale Attention Dense Layer (MSAD-Layer) was introduced to replace the standard dense blocks; by synergistically combining multi-scale feature fusion with attention mechanisms, it enhances the perception of key fine-grained features, such as specific facial patterns, against complex, cluttered backgrounds. Third, an Inverted Bottleneck Transition Layer (IBT-Layer) was adopted to further optimize the transmission of information between stages.
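The MSAD-Layer is described only at a high level here, as a combination of multi-scale feature fusion and attention. As a rough NumPy sketch of that idea (not the paper's actual implementation), the fragment below fuses a feature map with a 2x-pooled copy of itself and then applies a squeeze-and-excitation-style channel gate; the weight shapes and reduction ratio are illustrative assumptions:

```python
import numpy as np

def channel_attention(x, reduction=4, rng=None):
    """SE-style channel attention on a (C, H, W) feature map.

    The two fully connected weight matrices are randomly initialized
    purely for illustration; in a trained network they are learned.
    """
    c = x.shape[0]
    rng = np.random.default_rng(0) if rng is None else rng
    w1 = rng.standard_normal((c // reduction, c)) * 0.1   # squeeze FC
    w2 = rng.standard_normal((c, c // reduction)) * 0.1   # excite FC
    squeeze = x.mean(axis=(1, 2))                 # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)        # ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate in (0, 1)
    return x * gate[:, None, None]                # reweight channels

def multi_scale_fuse(x):
    """Average the original map with a 2x avg-pooled, nearest-upsampled
    copy: a crude stand-in for multi-scale feature fusion."""
    c, h, w = x.shape
    pooled = x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))
    upsampled = pooled.repeat(2, axis=1).repeat(2, axis=2)
    return 0.5 * (x + upsampled)

# Toy feature map: 8 channels on a 4x4 spatial grid.
feat = np.random.default_rng(1).standard_normal((8, 4, 4))
out = channel_attention(multi_scale_fuse(feat))
print(out.shape)  # (8, 4, 4): spatial shape preserved, channels reweighted
```

The design point this illustrates is that attention reweights channels without changing the feature map's shape, so such a layer can replace a standard dense layer in-place.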
Efficient dimensionality reduction and down-sampling preserve the integrity of the image feature information, preventing the loss of critical details during feature map compression. A high-quality dataset of visible-light cow face images was collected in natural, complex breeding environments, and the improved model was trained and evaluated within a metric learning framework. Experimental results demonstrate the superior performance of the optimized architecture. On the test set, LCFI-Net achieved a recognition accuracy of 93.54%, an improvement of 2.04 percentage points over the baseline DenseNet121. More significantly, the computational cost was substantially reduced: LCFI-Net contains only 1.02 M parameters, 6.07 M fewer than the original DenseNet121. Furthermore, comparative experiments against mainstream lightweight and heavyweight models validated the robustness of LCFI-Net: its accuracy exceeded that of MobileNetV2, ShuffleNetV2, MobileFaceNet, ResNet50, and ResNet18 by 4.50, 4.46, 4.08, 2.75, and 2.29 percentage points, respectively, and its structure offered the best trade-off between accuracy and speed. In conclusion, LCFI-Net achieves an effective equilibrium between recognition precision and computational efficiency. These findings provide a solid technical foundation for deploying high-precision cow identity recognition on resource-constrained edge devices, such as mobile inspection robots and intelligent barn equipment.
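The abstract states that the model is evaluated within a metric learning framework. A common identification scheme in such frameworks (not necessarily the exact protocol used here) matches a query face embedding against an enrolled gallery by cosine similarity; the embeddings and cow IDs below are invented for illustration:

```python
import numpy as np

def l2_normalize(v, eps=1e-12):
    """Project embeddings onto the unit hypersphere so that the dot
    product equals cosine similarity."""
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + eps)

def identify(query_emb, gallery_embs, gallery_ids):
    """Return the ID whose gallery embedding has the highest cosine
    similarity to the query embedding (nearest-neighbor matching)."""
    sims = l2_normalize(gallery_embs) @ l2_normalize(query_emb)
    return gallery_ids[int(np.argmax(sims))]

# Toy gallery: one reference embedding per enrolled cow (IDs invented).
gallery = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
ids = ["cow_017", "cow_042", "cow_099"]

# A query embedding close to the second reference matches cow_042.
query = np.array([0.1, 0.9, 0.05])
print(identify(query, gallery, ids))  # cow_042
```

One advantage of this gallery-matching design is that new animals can be enrolled by adding a single reference embedding, without retraining the backbone network.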