
Real-time peanut pod recognition method based on the lightweight model PPINET

  • Abstract: Traditional CNN algorithms for peanut pod appearance recognition are memory- and compute-intensive, which makes them difficult to deploy on resource-constrained edge terminals. To address this, this paper proposes an efficient peanut pod recognition model, PPINET (peanut pod identification network), designed for the resource limits of embedded devices. The model combines depthwise separable convolutions with inverted residual blocks to sharply reduce the parameter count and computational cost while preserving feature-extraction capability; it introduces an MQA (multi-query attention) module to strengthen the extraction of key features, and applies the TuNAS strategy (an easy-to-tune and scalable implementation of efficient neural architecture search with weight sharing) to optimize the network structure for resource-constrained devices. In addition, knowledge distillation from a ResNet (residual neural network) teacher, combined with three-fold cross-validation training, further improves accuracy. The final model is quantized to the RKNN format and deployed with NPU acceleration on a Rockchip RK3588, meeting practical application requirements. PPINET occupies only 1.85 MB, with 0.49 M parameters and 0.30 GFLOPs of computation. It achieves 98.65% classification accuracy on peanut pods and an inference speed of 321 fps on the RK3588, enabling real-time, accurate peanut pod detection.
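The distillation step described above (a ResNet teacher guiding the small PPINET student) can be sketched with the standard temperature-scaled distillation loss. The temperature and weighting factor below are illustrative choices, not values reported in the paper:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.7):
    """Combine a soft KL term against the teacher's softened distribution
    with a hard cross-entropy term against the ground-truth label."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    ce = -math.log(softmax(student_logits)[label])  # hard-label cross-entropy
    # T^2 rescales gradients of the softened term, as in standard distillation.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce
```

When the student's logits match the teacher's, the KL term vanishes and only the weighted hard-label term remains, so the loss measures how far the student is from both the teacher and the labels.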

     

    Abstract: To address the challenges posed by traditional convolutional neural network (CNN) algorithms in peanut pod appearance recognition, such as high memory consumption, computational complexity, and difficulty of deployment on resource-constrained edge devices, this study proposed an efficient, lightweight, and real-time identification model named PPINET (peanut pod identification network). Designed specifically for embedded systems, PPINET achieved high accuracy and low latency while significantly reducing computational overhead, making it highly suitable for intelligent agricultural applications. The architecture of PPINET integrated depthwise separable convolutions and inverted residual blocks, which effectively reduced both the number of parameters and floating-point operations (FLOPs), without compromising the model’s ability to extract discriminative features from peanut pod images. This lightweight backbone enabled the model to run efficiently on low-power edge devices. To further enhance feature extraction, the model incorporated a lightweight attention module, the multi-query attention (MQA) mechanism. Specifically optimized for embedded deployment, MQA strengthened the network’s focus on key features, thereby improving classification accuracy and robustness under variable conditions. To adaptively optimize the model structure for deployment environments with strict resource constraints, this study adopted TuNAS, a scalable and easy-to-tune framework for efficient neural architecture search (NAS) with weight sharing. Based on TuNAS, the study designed a convolutional unit called Tun, which integrated multiple configurable inverted residual blocks and allowed dynamic architectural adjustments according to application needs. This design ensured that PPINET remained adaptable and efficient across various hardware platforms. Several training strategies were employed to further enhance model performance. 
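The parameter savings that depthwise separable convolutions provide over standard convolutions can be checked with simple arithmetic. The channel sizes below are illustrative, not PPINET's actual layer widths:

```python
def standard_conv_params(k, c_in, c_out):
    """k x k standard convolution: every output channel mixes all input channels."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution that mixes channels."""
    return k * k * c_in + c_in * c_out

std = standard_conv_params(3, 64, 128)        # 73728 weights
sep = depthwise_separable_params(3, 64, 128)  # 8768 weights
print(std, sep, round(std / sep, 1))          # roughly an 8x reduction
```

Stacking such factorized blocks is what lets the full model stay under half a million parameters while retaining enough capacity for fine-grained pod appearance features.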
A refined image preprocessing pipeline was introduced, which included contour extraction and precise cropping to mitigate distortions caused by traditional image scaling. This process significantly improved the quality of training samples and enhanced the model’s generalization ability across different peanut pod appearances. In addition, a cosine annealing learning rate schedule was applied to dynamically adjust the learning rate, accelerating convergence while avoiding suboptimal local minima. For deployment, the model was quantized into the RKNN format, enabling hardware-level acceleration on a Rockchip RK3588 platform equipped with a neural processing unit (NPU). The quantization process significantly reduced memory usage and inference latency while maintaining classification performance. The final model size was only 1.85 MB, with 0.49 million parameters and a computational complexity of just 0.30 GFLOPs, making it highly suitable for embedded environments. Experimental results demonstrated that PPINET achieved an outstanding classification accuracy of 98.65% in peanut pod recognition tasks. Further comparative evaluations showed that the inclusion of the MQA module improved recognition accuracy by up to 2.34% over conventional attention mechanisms such as SE (squeeze-and-excitation) and CBAM (convolutional block attention module), highlighting its superior performance-to-efficiency trade-off. In deployment tests, PPINET achieved an inference speed of 321 frames per second (fps) on the RK3588 development board, significantly outperforming popular embedded systems like the Raspberry Pi 4B and Jetson Nano. This real-time processing capability enabled PPINET to be effectively used in automated peanut pod sorting systems, where rapid and accurate classification was essential for improving agricultural productivity. 
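The cosine annealing schedule mentioned above follows the standard closed form: the learning rate decays from a maximum to a minimum along a half cosine. The maximum, minimum, and period below are placeholder values, not the paper's training hyperparameters:

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max=0.01, lr_min=1e-5):
    """Decay the learning rate from lr_max to lr_min along a half-cosine curve."""
    cos_term = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos_term)

# Starts at lr_max, crosses the midpoint halfway through, and ends at lr_min.
print(cosine_annealing_lr(0, 100))    # 0.01
print(cosine_annealing_lr(50, 100))   # ~0.005
print(cosine_annealing_lr(100, 100))  # 1e-05
```

The slow decay near the start preserves large exploratory steps, while the flat tail near `lr_min` lets the weights settle, which is why the schedule helps avoid the suboptimal local minima noted in the abstract.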
The model’s compact architecture and hardware-accelerated design made it particularly well-suited for smart farming applications requiring low-power, real-time AI solutions. In conclusion, PPINET successfully addressed the technical bottlenecks of deploying CNN-based recognition models on embedded devices by combining a lightweight and efficient network design, a scalable NAS framework, optimized attention mechanisms, and hardware-friendly quantization techniques. This work provided a practical and reliable solution for real-time agricultural product identification and paved the way for future innovations in edge AI applications for precision agriculture.
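The int8 quantization applied before deployment can be illustrated with the usual affine (scale and zero-point) mapping that post-training quantization toolchains use. This is a generic sketch of the principle, not Rockchip's actual RKNN conversion code:

```python
def quantize_int8(values, v_min, v_max):
    """Affine quantization: map floats in [v_min, v_max] onto int8 codes [-128, 127]."""
    scale = (v_max - v_min) / 255.0
    zero_point = round(-128 - v_min / scale)
    q = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize_int8(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-0.5, 0.0, 0.25, 0.9]
q, scale, zp = quantize_int8(weights, -1.0, 1.0)
approx = dequantize_int8(q, scale, zp)  # close to the originals, at 1/4 the storage
```

Each tensor is stored as one byte per value plus a shared scale and zero point, which is how quantization cuts memory use and lets the NPU run integer arithmetic with only a small rounding error per weight.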

     

