Information Technology, 2022, Issue 23

A Review of Human Body Posture Estimation Methods Based on Deep Learning

CAO Xiaoyu, XIA Duanfeng

(School of Computer and Information Engineering, Hubei Normal University, Huangshi 435002, China)

Abstract: Growing real-world demand for human body posture estimation has drawn wide attention to research in this area. This paper reviews recent progress in human body posture estimation based on deep learning. It first outlines the basic steps, categories, and application scenarios of human body posture estimation, together with the open problems and challenges for future research. It then describes three foundations of the field: the main network architectures, the datasets, and the key metrics used to evaluate models. Finally, it introduces several major human body posture estimation networks in detail, selects the studies that have had an important influence on the field, and analyzes the architectures, working principles, practical applications, and shortcomings of some classical models.

Keywords: deep learning; single/multiple human body posture estimation; evaluation metrics; dataset

DOI: 10.19850/j.cnki.2096-4706.2022.23.001

Funding: National Natural Science Foundation of China (62172144)

CLC number: TP39    Document code: A    Article ID: 2096-4706(2022)23-0001-06



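About the authors: CAO Xiaoyu (b. 1998), female, Han ethnicity, from Xianyang, Shaanxi, is a master's student whose main research interest is computer vision. Corresponding author: XIA Duanfeng (b. 1979), female, Han ethnicity, from Wuxue, Hubei, is a senior laboratory technician and master's supervisor who holds a master's degree; her main research interests are computer applications, software development, and computer information education.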