Modern Information Technology (现代信息科技)

Computer Technology, 2022, Issue 21

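Fine-Grained Image Recognition Based on Weakly-Supervised Multi-Attention Fusion Network

HUANG Cheng ¹,², ZENG Zhigao ¹,², ZHU Wenqiu ¹,², WEN Zhiqiang ¹,², YUAN Xinpan ¹,²

(1. College of Computer Science, Hunan University of Technology, Zhuzhou, Hunan 412007, China; 2. Hunan Key Laboratory of Intelligent Information Perception and Processing Technology, Zhuzhou, Hunan 412007, China)

Abstract: Discriminative regions are often difficult to locate in fine-grained image recognition. To address this problem, a weakly-supervised multi-attention fusion network is proposed, which localizes discriminative regions accurately by combining two attention modules. The dual-domain self-attention module combines multiple types of attention to strengthen the model's extraction of key features. The hybrid convolutional attention fusion module fuses attention at different scales through both a parallel and a serial architecture, capturing global and local relationships among features. Experimental results show that the proposed method is effective and improves considerably on the baseline model.

Keywords: fine-grained image classification; deep learning; attention mechanism; multi-attention fusion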
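The difference between fusing attention in parallel and in series, as mentioned in the abstract, can be illustrated with a small sketch. This is not the authors' network: the two gates below (`channel_attention`, `spatial_attention`) are hypothetical, parameter-free stand-ins chosen only to keep the example short, and the averaging used in `fuse_parallel` is likewise an assumption.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat):
    """Scale each channel by a sigmoid of its global average.

    feat is a nested list shaped [C][H][W]. This parameter-free gate is an
    illustrative stand-in, not the module from the paper.
    """
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out

def spatial_attention(feat):
    """Scale each spatial position by a sigmoid of its mean across channels."""
    n_ch, height, width = len(feat), len(feat[0]), len(feat[0][0])
    gate = [[sigmoid(sum(feat[c][i][j] for c in range(n_ch)) / n_ch)
             for j in range(width)] for i in range(height)]
    return [[[feat[c][i][j] * gate[i][j] for j in range(width)]
             for i in range(height)] for c in range(n_ch)]

def fuse_parallel(feat):
    """Parallel fusion: apply both attentions to the same input, then average."""
    ch, sp = channel_attention(feat), spatial_attention(feat)
    return [[[0.5 * (a + b) for a, b in zip(row_c, row_s)]
             for row_c, row_s in zip(plane_c, plane_s)]
            for plane_c, plane_s in zip(ch, sp)]

def fuse_serial(feat):
    """Serial fusion: channel attention first, spatial attention on its output."""
    return spatial_attention(channel_attention(feat))

# Toy feature map: 2 channels on a 2x2 spatial grid, all ones.
feat = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
parallel_out = fuse_parallel(feat)
serial_out = fuse_serial(feat)
```

In the parallel branch both gates see the original features, so neither influences the other; in the serial branch the second gate is computed from already re-weighted features, so the two outputs generally differ even on the same input.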

DOI:10.19850/j.cnki.2096-4706.2022.21.019

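Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0100400); Hunan Provincial Department of Education projects (21A0350, 21C0439, 20K046); Natural Science Foundation of Hunan Province (2022JJ50051, 2020JJ6088, 2022JJ30231)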

CLC number: TP183; Document code: A; Article ID: 2096-4706(2022)21-0078-06

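About the authors: HUANG Cheng (born 1997), male, Han ethnicity, from Yichun, Jiangxi, master's student; research interests: machine learning and image processing. ZENG Zhigao (born 1973), male, Han ethnicity, from Zhuzhou, Hunan, professor, Ph.D.; research interests: machine learning and intelligent information processing. ZHU Wenqiu (born 1968), male, Han ethnicity, from Zhuzhou, Hunan, professor, master's degree; research interest: artificial intelligence. WEN Zhiqiang (born 1973), male, Han ethnicity, from Xiangxiang, Hunan, professor, Ph.D.; research interest: pattern recognition. YUAN Xinpan (born 1982), male, Han ethnicity, from Zhuzhou, Hunan, associate professor, Ph.D.; research interest: computer vision.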
