Network Intrusion Detection Based on CNN on Unbalanced Data
Abstract: The self-learning ability of deep learning allows an intrusion detection system to be continuously updated and extended, which strengthens its defensive capability. However, most current research on deep-learning-based network intrusion detection does not consider class imbalance in the datasets. To address this problem, this paper proposes a method that combines a category recombination technique with the Focal Loss function to classify raw network intrusion traffic. The raw traffic is converted into grayscale images and fed into a convolutional neural network (CNN) for feature extraction and learning. The category recombination technique keeps the attack categories in the training set relatively balanced, while the Focal Loss function increases the CNN model's attention to complex, hard-to-classify samples by adjusting the class weights. In experiments on three CNN models, macro-F1 increased by 9.41%, 1.65%, and 4.39%, respectively. The results show that the method handles class imbalance in network intrusion detection effectively and clearly improves the recognition accuracy of minority-class samples.
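The abstract gives neither the Focal Loss formula nor the hyperparameters used. As a rough sketch only, assuming the commonly used multi-class form FL(p_t) = -α_t (1 - p_t)^γ log(p_t), the weighting can be illustrated in plain Python. The function names `softmax` and `focal_loss`, the default `gamma=2.0`, and the optional per-class `alpha` list are illustrative assumptions, not the settings reported in the paper.

```python
import math


def softmax(logits):
    """Convert one sample's raw class scores into probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(batch_logits, labels, gamma=2.0, alpha=None):
    """Mean focal loss over a batch (hypothetical helper, not the paper's code).

    batch_logits: list of per-sample score lists, one score per class
    labels:       list of integer class indices
    gamma:        focusing parameter; gamma=0 reduces to cross-entropy
    alpha:        optional list of per-class weights (assumed, not from the paper)
    """
    losses = []
    for logits, y in zip(batch_logits, labels):
        # Probability assigned to the true class, clipped to avoid log(0).
        p_t = max(softmax(logits)[y], 1e-12)
        # Well-classified samples (p_t near 1) are down-weighted.
        weight = (1.0 - p_t) ** gamma
        if alpha is not None:
            weight *= alpha[y]
        losses.append(-weight * math.log(p_t))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    easy = [[5.0, 0.0]]  # confident and correct for class 0
    hard = [[0.0, 5.0]]  # confident and wrong for class 0
    print("easy sample:", focal_loss(easy, [0]))
    print("hard sample:", focal_loss(hard, [0]))
```

With `gamma=0` and no `alpha`, the function reduces to ordinary cross-entropy, which makes it easy to compare the two losses on the same batch. In a real CNN training loop the same formula would be written with the tensor operations of the chosen deep learning framework.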