Abstract: To address the problems of traditional surface defect detection algorithms, such as low detection efficiency and difficulty in handling complex detection tasks, a new attention mechanism algorithm is proposed by combining deep learning with attention mechanism techniques. First, the Convolutional Neural Network (CNN) and Transformer architectures are re-examined and the high-dimensional feature extraction module is redesigned; second, the latest attention mechanisms are improved to capture global features. The algorithm can be easily embedded into various CNNs to improve the performance of image classification and surface defect detection. A ResNet using this algorithm reaches accuracies of 83.22% on the CIFAR-100 dataset and 77.98% on the textile defect dataset, outperforming the classical attention mechanism SE and recent methods such as Fca.
Keywords: defect detection; attention mechanism; Convolutional Neural Network; image classification
DOI: 10.19850/j.cnki.2096-4706.2023.03.035
CLC number: TP391.4    Document code: A    Article ID: 2096-4706(2023)03-0151-04
Application of Image Classification Algorithm Based on Soft Attention Mechanism in Defect Detection
FANG Zongchang, WU Sijiu
(Chengdu University of Information Technology, Chengdu 610225, China)
Abstract: To address the problems of traditional surface defect detection algorithms, such as low detection efficiency and difficulty in handling complex detection tasks, a new attention mechanism algorithm is proposed by combining deep learning and attention mechanism technology. First, the Convolutional Neural Network (CNN) and Transformer architectures are re-examined and the high-dimensional feature extraction module is redesigned; second, the latest attention mechanisms are improved to capture global features. The algorithm can be easily embedded into various CNNs, improving the performance of image classification and surface defect detection. The accuracy of the ResNet network using this algorithm reaches 83.22% on the CIFAR-100 dataset and 77.98% on the textile defect dataset, which is superior to the classical attention mechanism SE and the latest methods such as Fca.
Keywords: defect detection; attention mechanism; Convolutional Neural Network; image classification
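The abstract states that the proposed module can be easily embedded into various CNNs such as ResNet. As a minimal sketch of that embedding pattern, the PyTorch code below inserts an SE-style channel-attention module (the SE baseline mentioned in the abstract) into a ResNet basic block; the names ChannelAttention and AttnBasicBlock, the reduction ratio of 16, and the placement of the attention before the shortcut addition are illustrative assumptions rather than the authors' actual design.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    # SE-style channel attention: squeeze spatial dimensions, then excite channels.
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w  # re-weight each channel of the input feature map

class AttnBasicBlock(nn.Module):
    # ResNet basic block with the attention module applied on the residual branch.
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.attn = ChannelAttention(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = self.attn(out)       # apply attention before the residual add
        return self.relu(out + x)  # identity shortcut

x = torch.randn(2, 64, 32, 32)
print(AttnBasicBlock(64)(x).shape)  # torch.Size([2, 64, 32, 32])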
About the authors: FANG Zongchang (b. 1999), male, Han nationality, from Heze, Shandong; master's degree candidate; research interest: computer vision. WU Sijiu (b. 1970), male, Han nationality, from Chengdu, Sichuan; professor, bachelor's degree; research interests: artificial intelligence and data mining, graphics and image processing and their applications.