Modern Information Technology (现代信息科技)

Information Technology, 2023, Issue 5

Small Object Detection Method under the Background of Multi-Scale Image Based on Attention and Context

LI Rongguang, YANG Menglong

(School of Aeronautics and Astronautics, Sichuan University, Chengdu 610065, China)

Abstract: In multi-scale, multi-object scenes, small objects occupy few pixels and their features are hard to extract, so their detection accuracy is far lower than that of medium and large objects. This paper uses discrete self-attention to extract cross-scale global context, and uses cross-scale channel attention and scale attention to make the model more sensitive to scale. Together these capture more varied and richer object-object and background-object information, so that every feature level carries richer information across both space and scale, which improves small object detection in multi-scale scenes. On the COCO dataset, the method's AP_S (AP on small objects) exceeds that of the RetinaNet baseline by up to 2.9 points. On the DIOR dataset it reaches an mAP of 69.0, surpassing the best-performing algorithm on that dataset, while retaining the speed of a single-stage detector.

Keywords: object detection; small object detection; discrete self-attention; cross-scale attention

DOI: 10.19850/j.cnki.2096-4706.2023.05.001

CLC number: TP391.4 Document code: A Article ID: 2096-4706(2023)05-0001-07
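
The abstract names scale attention but gives no formula for it, so the following is only a minimal sketch of the general idea of weighting feature-pyramid levels before fusing them. It is not the authors' module. Every name here (`scale_attention_fuse`, `resize_nearest`, the toy levels `p3`, `p4`, `p5`) is hypothetical, and the choice of one scalar weight per level, taken from a softmax over each level's global average, is an assumption made for illustration.

```python
import math


def global_avg(fmap):
    """Mean of all values in a 2-D feature map (a list of rows)."""
    total = sum(sum(row) for row in fmap)
    return total / (len(fmap) * len(fmap[0]))


def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2-D feature map to (out_h, out_w)."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def scale_attention_fuse(levels, target_index):
    """Fuse all pyramid levels at the resolution of levels[target_index].

    Assumption (not from the paper): each level gets one scalar weight,
    computed as a softmax over the levels' global averages.
    """
    target = levels[target_index]
    out_h, out_w = len(target), len(target[0])
    weights = softmax([global_avg(f) for f in levels])
    resized = [resize_nearest(f, out_h, out_w) for f in levels]
    fused = [
        [sum(w * r[i][j] for w, r in zip(weights, resized)) for j in range(out_w)]
        for i in range(out_h)
    ]
    return fused, weights


# Toy single-channel pyramid: a fine 4x4 level, a 2x2 level, and a 1x1 level.
p3 = [[0.0] * 4 for _ in range(4)]
p3[1][2] = 4.0  # one strong response, standing in for a small object
p4 = [[1.0, 1.0], [1.0, 1.0]]
p5 = [[2.0]]

fused, weights = scale_attention_fuse([p3, p4, p5], target_index=0)
```

In this toy setup the fused map keeps the fine level's 4x4 resolution, so the localized response at row 1, column 2 stays distinguishable while the coarser levels add context. A learned module would typically replace the global-average scoring with trainable layers and operate per channel.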

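About the authors: LI Rongguang (born 1997), male, Han ethnicity, from Bazhong, Sichuan; master's student; research interest: small object detection. Corresponding author: YANG Menglong (born 1983), male, Han ethnicity, from Chengdu, Sichuan; associate research fellow, Ph.D.; research interests: computer vision, pattern recognition, and image processing.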
