Abstract:
Accurate and rapid detection of broken or damaged maize kernels is highly required in modern agriculture. However, conventional manual inspection is error-prone, labor-intensive, and time-consuming, so it cannot meet the demands of large-scale applications and constrains the efficiency and scalability of modern farming. In contrast, image recognition based on deep learning, such as the SqueezeNet network, can be expected to substantially enhance the accuracy and efficiency of broken kernel detection. Nevertheless, the SqueezeNet model still faces challenges in identifying small targets, such as maize kernels. Deep, multi-layered convolutions are required to process the input images effectively. Although deeper architectures enhance feature extraction, they also impose substantial demands on processing power, memory, and storage. In particular, real-time requirements are difficult to meet in resource-constrained environments, such as the mobile devices or embedded systems that are commonly deployed in agricultural settings. In this study, an optimized variant of the SqueezeNet model was introduced specifically to detect broken maize kernels. The architecture, termed SqueezeNet-dw2, builds on the original SqueezeNet framework and reduces its computational complexity, making it more suitable for real-time agricultural applications. Several key modifications were made to the classic SqueezeNet architecture. Firstly, the number of Fire modules and the number of input channels of the final convolutional layer were reduced. Additionally, the standard convolutions were replaced with depthwise separable ones, which preserved the feature extraction while significantly lowering the computational cost.
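The saving from depthwise separable convolutions can be illustrated with a simple parameter count. The following is a minimal sketch, not the configuration used in this study: the layer sizes (64 input channels, 128 output channels, 3×3 kernel) are hypothetical, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes the channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
dws = depthwise_separable_params(c_in, c_out, k)
ratio = dws / std  # equals 1 / c_out + 1 / k**2

print(f"standard: {std}, depthwise separable: {dws}, ratio: {ratio:.4f}")
```

For these assumed sizes, the separable layer needs roughly 12% of the weights of the standard layer; the ratio approaches 1/k² as the number of output channels grows.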
Furthermore, the Ghost module was integrated to refine the expand layer of the Fire module, together with a 3×3 convolution, in order to reduce the computational demands and the number of parameters. The enhanced architecture was termed SqueezeNet-dw2-gh, indicating the integration of the Ghost module. The refined network is more efficient and better suited to real-time agricultural applications than the original SqueezeNet. The parametric rectified linear unit (PReLU) was then employed as the activation function, so that the activation parameters were learned adaptively during training. This mitigated the accuracy degradation caused by network simplification and maintained high performance at a lower computational complexity. The final optimized model was termed SqueezeNet-dw2-gh-P. Experimental results show that the parameters were reduced to 0.60 MB, a 51.61% decrease compared with the original architecture, while the computational cost was lowered by 48.54%, to an operation count of 36.71 MFLOPs. Notably, the optimized network achieved validation and test accuracies of 93.98% and 92.33%, respectively, indicating effective and efficient detection of broken maize kernels. In conclusion, the improved SqueezeNet architecture achieved substantial reductions in parameter count, memory footprint, and computational demands, making it suitable for deployment on resource-constrained mobile and embedded devices. It can offer a practical solution for the real-time detection of broken maize kernels in modern agriculture.
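As a quick consistency check on the reported reductions, the baseline values they imply can be back-calculated. This is a sketch under the assumption that 0.60 MB and 36.71 MFLOPs are the values after optimization and that the percentages are relative to the original SqueezeNet; the helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(value_after: float, reduction_pct: float) -> float:
    """Return the pre-reduction value implied by a post-reduction value and a percentage decrease."""
    return value_after / (1.0 - reduction_pct / 100.0)


# Figures taken from the abstract; interpreting them as post-optimization values is an assumption.
baseline_params_mb = implied_baseline(0.60, 51.61)
baseline_mflops = implied_baseline(36.71, 48.54)

print(f"implied baseline parameters: {baseline_params_mb:.2f} MB")
print(f"implied baseline cost: {baseline_mflops:.2f} MFLOPs")
```

Under these assumptions, the implied baseline is about 1.24 MB of parameters and about 71.34 MFLOPs.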