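Abstract: Human pose estimation is one of the most active research areas in computer vision. This paper first analyzes 2D human pose estimation and then introduces 3D human pose estimation, which adds depth information. Next, it reviews current research on deep-learning-based 3D human pose estimation: for both single-person and multi-person pose estimation, and from the two perspectives of monocular and multi-view images, it describes how different models address difficulties such as estimation accuracy and pose occlusion. Finally, it compares the performance metrics of the algorithms on public datasets and discusses future development trends.
Keywords: 3D human pose estimation; deep learning; keypoint estimation; estimation accuracy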
DOI:10.19850/j.cnki.2096-4706.2023.04.030
Funding: General Program of the Anhui Provincial Natural Science Foundation (2208085ME128)
CLC number: TP391.4 Document code: A Article ID: 2096-4706(2023)04-0117-05
Research Review of 3D Human Pose Estimation Based on Deep Learning
HU Jiaqi1, WANG Chengjun2, YANG Chaoyu2
(1. School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China; 2. School of Artificial Intelligence, Anhui University of Science & Technology, Huainan 232001, China)
Abstract: Human pose estimation is one of the most active research fields in computer vision. This paper first analyzes 2D human pose estimation and then introduces 3D human pose estimation, which adds depth information. Second, it describes current research results on 3D human pose estimation based on deep learning. For single-person and multi-person pose estimation, and from the two directions of monocular images and multi-view images, it presents the solutions that different models offer to difficulties such as estimation accuracy and pose occlusion. Finally, it compares and analyzes the performance metrics of each algorithm on public datasets and discusses future development trends.
Keywords: 3D human pose estimation; deep learning; key point estimation; estimation accuracy
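About the authors: HU Jiaqi (born 1997), female, Han ethnicity, from Tianjin, master's student; research interest: computer vision. WANG Chengjun (born 1978), male, Han ethnicity, from Lianshui, Jiangsu, professor, Ph.D.; research interests: computer vision, intelligent machinery and robotics. YANG Chaoyu (born 1981), male, Han ethnicity, from Huainan, Anhui, professor, Ph.D.; research interests: computer vision, big data analysis and mining.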