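Abstract: In video surveillance scenes, the diversity and similarity of vehicle appearance, together with the unconstrained surveillance environment, make it difficult to tell different vehicles apart by global appearance features alone. Local region features are more discriminative than global appearance features. To also keep the algorithm fast, this paper proposes a search-by-image vehicle retrieval algorithm based on fused regional and global features. The algorithm has three stages: first, a global vehicle feature network is trained with vehicle IDs as labels; second, a local region feature network is added and trained jointly with the global feature network; finally, at inference, only the features from the global feature network are used to compute the similarity between vehicle images. The algorithm is tested on a dataset of images from video surveillance scenes. The proposed method achieves a Top10 performance of 91.3%, with a feature extraction time of 13.8 ms and a single feature comparison time of 0.0016 ms, which meets application requirements.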
CLC number: TP391.41 Document code: A Article ID: 2096-4706(2019)12-0001-04
A Vehicle Retrieval Algorithm Based on Regional and Global Fusion Feature
ZHAO Qingli,WEN Li,HUANG Yuheng,JIN Xiaofeng,LIANG Tiancai
(Guangzhou GRG Banking Technology Co.,Ltd.,Guangzhou 510006,China)
Abstract: In video surveillance scenarios, the diversity and similarity of vehicle appearance and the unconstrained surveillance environment make it difficult to distinguish different vehicles by global appearance features. Compared with global appearance features, local region features are more discriminative for vehicle retrieval. To also account for the speed of the algorithm, this paper proposes a vehicle retrieval algorithm based on regional and global fusion features. The algorithm has three stages: first, a global vehicle feature network is trained with vehicle IDs as labels; second, a local region feature network is added, and the local region and global feature networks are trained jointly; third, at inference, only the features of the global feature network are used to compute the similarity between vehicle images. Images from video surveillance scenarios are used as the dataset to test the algorithm. The results show that the Top10 performance reaches 91.3%, and that feature extraction and a single feature comparison take 13.8 ms and 0.0016 ms respectively, which satisfies the application requirements.
Keywords:video surveillance;vehicle retrieval;regional and global fusion feature
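The inference stage described in the abstract, where only global features are compared and the gallery is ranked by similarity, can be illustrated with a minimal sketch. The abstract does not name the similarity measure, so cosine similarity is an assumption here, and the function names (`rank_gallery`, `top_k_hit`), the three-dimensional toy features, and the image and vehicle identifiers are all hypothetical. In practice the feature vectors would come from the trained global feature network rather than being written by hand.

```python
import math
from typing import Dict, List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a feature vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two features (assumed metric; the abstract does not specify one)."""
    na, nb = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(na, nb))


def rank_gallery(query: Sequence[float], gallery: Dict[str, Sequence[float]]) -> List[str]:
    """Return gallery image names ordered from most to least similar to the query."""
    return sorted(
        gallery,
        key=lambda name: cosine_similarity(query, gallery[name]),
        reverse=True,
    )


def top_k_hit(
    query: Sequence[float],
    query_id: str,
    gallery: Dict[str, Sequence[float]],
    gallery_ids: Dict[str, str],
    k: int = 10,
) -> bool:
    """True if any of the k most similar gallery images shows the query's vehicle ID."""
    ranked = rank_gallery(query, gallery)[:k]
    return any(gallery_ids[name] == query_id for name in ranked)


# Hypothetical toy features standing in for global-network outputs.
gallery = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.0, 1.0, 0.0],
    "img_c": [0.7, 0.7, 0.1],
}
gallery_ids = {"img_a": "car_1", "img_b": "car_2", "img_c": "car_3"}
query = [1.0, 0.0, 0.0]

ranking = rank_gallery(query, gallery)
print(ranking)
print(top_k_hit(query, "car_1", gallery, gallery_ids, k=1))
```

Averaging `top_k_hit` with `k=10` over all queries in a test set would give a Top10 hit rate of the kind reported in the abstract, although the paper's exact evaluation protocol is not stated in this section.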