
基于ResNeSt和改进Transformer的多标签图像分类算法

Multi-Label Image Classification Algorithm Based on ResNeSt and Improved Transformer

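  • Abstract: At present, multi-label classification algorithms based on deep learning still have some problems, such as insufficient use of the correlation between labels and the difficulty of classifying small targets. To address these problems, a multi-label image classification algorithm is proposed. The algorithm uses the split-attention network ResNeSt for feature extraction and combines BatchFormerV2 with a Transformer to form a dual-branch network that encodes the features. In the decoding stage, the cross-attention module of the Transformer Decoder processes the features adaptively to achieve better classification. Experimental results show that the model achieves an mAP of 88.4% on the COCO dataset and an average precision of 96.0% on the VOC2007 dataset, which improves the accuracy of multi-label image classification to a certain extent.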

     

    Abstract: Multi-label classification algorithms based on deep learning still face several problems: the correlation between labels is not fully exploited, and small targets are harder to identify than large ones. In this paper, we propose a multi-label image classification algorithm that uses the split-attention network ResNeSt for feature extraction and a dual-branch Transformer to query class labels. We also use the cross-attention module in the Transformer Decoder to extract local features adaptively. To enhance the classification performance of the Transformer module, we introduce BatchFormerV2, which turns the Transformer into a dual-branch network. The model achieves an mAP of 88.4% on the COCO dataset and an average precision of 96.0% on the VOC2007 dataset, which improves the accuracy of multi-label image classification to a certain extent.
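
    The reported figures are mean average precision (mAP) values. The abstract does not state the exact evaluation protocol, so the following is only a minimal sketch of one common definition: the non-interpolated average precision (AP) is computed per label and then averaged over labels. The function names and the list-of-lists input layout are assumptions made for illustration; VOC2007 results in particular are often computed with an interpolated variant that this sketch does not implement.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], targets: Sequence[int]) -> float:
    """Non-interpolated AP for a single label.

    scores[i]  : predicted confidence that image i carries the label
    targets[i] : 1 if image i really carries the label, else 0
    """
    # Rank images from the most to the least confident prediction.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if targets[idx] == 1:
            hits += 1
            # Precision measured at the rank of each positive image.
            precision_sum += hits / rank
    # Assumption: a label with no positive images contributes an AP of 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    score_matrix: List[List[float]], target_matrix: List[List[int]]
) -> float:
    """mAP: the mean of per-label AP values.

    Both matrices have one row per image and one column per label.
    """
    num_labels = len(score_matrix[0])
    aps = []
    for c in range(num_labels):
        label_scores = [row[c] for row in score_matrix]
        label_targets = [row[c] for row in target_matrix]
        aps.append(average_precision(label_scores, label_targets))
    return sum(aps) / num_labels
```

    Because AP depends only on how the images are ranked for each label, it is insensitive to the decision threshold, which is one reason it is a common summary metric for multi-label classifiers.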

     
