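Abstract: Compared with conventional scenes, object detection in remote sensing scenes is complicated by large image sizes, large numbers of small objects, and bounding boxes with rotation angles. These same difficulties also mean that there are more relationships between objects in remote sensing images to be mined. To improve the detection of rotated objects in remote sensing scenes, the oriented object detection algorithm Oriented R-CNN for Object Detection (ORCN) is optimized by adding a relation mining module. The relation mining module uses a dynamic graph neural network and a cross-attention mechanism so that the features and shape information of candidate regions interact effectively, enriching the contextual semantics of the candidate-region features. Experimental results show that, after adding the relation mining module, the model's performance on the DOTA remote sensing dataset improves by 1.53%, clearly outperforming the original detection algorithm.
Keywords: oriented object detection; remote sensing images; graph neural network; cross-attention mechanism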
DOI:10.19850/j.cnki.2096-4706.2023.07.019
CLC number: TP391.4  Document code: A  Article ID: 2096-4706(2023)07-0074-05
Research on Rotating Target Detection Algorithm Based on Relation Mining in Remote Sensing Scene
XIAO Yang, LI Wei
(School of Aerospace Science and Engineering, Sichuan University, Chengdu 610065, China)
Abstract: Compared with conventional scenes, target detection in remote sensing scenes faces difficulties such as large image size, a large number of small targets, and detection boxes with rotation angles. These difficulties also mean that more relationships between objects in remote sensing images can be mined. To improve the detection of rotating targets in remote sensing scenes, the Oriented R-CNN for Object Detection (ORCN) algorithm is optimized by adding a relationship mining module. The relationship mining module uses a dynamic graph neural network and a cross-attention mechanism to let the features and shape information of candidate regions interact effectively, enriching the contextual semantics of the candidate-region features. The experimental results show that, after adding the relationship mining module, the model's performance on the DOTA remote sensing dataset improves by 1.53%, which is significantly better than the original detection algorithm.
Keywords: rotating target detection; remote sensing image; graph neural network; cross-attention mechanism
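The abstract names the two ingredients of the relation mining module (a dynamic graph neural network and cross-attention between candidate-region features and shape information) but does not give its architecture. The following is only a minimal NumPy sketch of how such an interaction could look, under stated assumptions: the graph is a k-nearest-neighbour graph rebuilt from the current features, the shape descriptor of each candidate region is a small vector such as (w, h, θ), and the function names `knn_graph` and `relation_step` and the weight matrices `w_q`, `w_k`, `w_v` are hypothetical, not the authors' implementation.

```python
import numpy as np


def knn_graph(feats, k):
    """Return, for each proposal, the indices of its k nearest neighbours
    in feature space (the proposal itself is excluded)."""
    sq = np.sum(feats ** 2, axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T  # squared distances
    np.fill_diagonal(dist, np.inf)                            # forbid self-loops
    return np.argsort(dist, axis=1)[:, :k]


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def relation_step(feats, shapes, k, w_q, w_k, w_v):
    """One illustrative relation-mining step (assumed design, not the paper's).

    feats:  (N, C) candidate-region features
    shapes: (N, S) shape descriptors, e.g. (w, h, theta) of each oriented box
    Queries come from a region's own features, keys from its neighbours'
    shape descriptors, and values from its neighbours' features.
    """
    nbrs = knn_graph(feats, k)                 # (N, k); recomputed from the current features
    q = feats @ w_q                            # (N, D)
    keys = shapes[nbrs] @ w_k                  # (N, k, D)
    vals = feats[nbrs] @ w_v                   # (N, k, C)
    scores = np.einsum("nd,nkd->nk", q, keys) / np.sqrt(q.shape[1])
    attn = softmax(scores, axis=1)             # attention over the k neighbours
    context = np.einsum("nk,nkc->nc", attn, vals)
    return feats + context, attn, nbrs         # residual update keeps the original feature


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, c, s, d, k = 6, 8, 3, 4, 2              # toy sizes chosen for illustration
    feats = rng.normal(size=(n, c))
    shapes = rng.normal(size=(n, s))
    out, attn, nbrs = relation_step(
        feats, shapes, k,
        rng.normal(size=(c, d)), rng.normal(size=(s, d)), rng.normal(size=(c, c)),
    )
    print(out.shape, attn.shape, nbrs.shape)
```

Calling `relation_step` again on its own output rebuilds the neighbour graph from the updated features, which is the sense in which the graph is "dynamic" in this sketch. The residual form `feats + context` is a design choice made here so that the contextual information enriches the original candidate-region feature rather than replacing it.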
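About the authors: XIAO Yang (b. 1998), male, Han ethnicity, from Tieling, Liaoning, is a master's student whose research focuses on small-object detection algorithms in aerial scenes; LI Wei (b. 1990), male, Han ethnicity, from Chengdu, Sichuan, is an associate research fellow with a Ph.D. whose research covers video and image processing, sensor network technology, and video network application technology.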