
Graduate Student: 許嘉豐 (Richard Sugiarto)
Thesis Title: Motion Learning using 3D Reconstruction Pose from Multiple Cameras with Dynamic Time Warping
Advisor: 楊傳凱 (Chuan-Kai Yang)
Committee Members: 林伯慎 (Bor-Shen Lin), 賴源正 (Yuan-Cheng Lai)
Degree: Master
Department: School of Management - Department of Information Management
Year of Publication: 2022
Graduation Academic Year: 110
Language: English
Number of Pages: 53
Keywords: Human Pose, OpenPose, Intrinsic Camera Calibration, Extrinsic Camera Calibration, Dynamic Time Warping


Motion learning is a common task nowadays. It can be carried out by comparing every joint of one skeleton with the corresponding joints of another skeleton, and a pose estimator is used to obtain each joint's location. Many pose estimators exist, such as OpenPose and DensePose; in this thesis, OpenPose is used as the 2D pose estimator. In general, a single 2D pose estimator cannot capture all of the pose information, because it observes the subject from only one viewing direction. To overcome this limitation, multiple cameras are used to gather information from several views. Since multiple cameras are involved, each camera must be calibrated; from the calibrated cameras, 3D joint coordinates can be reconstructed. Finally, scoring is performed on the reconstructed 3D pose information using Dynamic Time Warping (DTW). DTW is an algorithm that measures the similarity between two temporal sequences, and it handles the case where the two sequences have different speeds or different numbers of frames.
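The thesis body is not reproduced on this page, so the following two sketches only illustrate the pipeline the abstract describes. The first is a minimal linear (DLT) triangulation of one joint from several calibrated cameras; the function name, array shapes, and the use of NumPy's SVD are assumptions for this example, not details taken from the thesis.

```python
import numpy as np

def triangulate_joint(projection_matrices, points_2d):
    """Linear (DLT) triangulation of one joint seen by several calibrated cameras.

    projection_matrices: list of 3x4 matrices P = K [R | t] obtained from the
                         intrinsic and extrinsic calibration of each camera.
    points_2d:           list of matching (x, y) detections, one per camera,
                         e.g. the same OpenPose keypoint in every view.
    Returns the joint's 3D position in world coordinates.
    """
    rows = []
    for P, (x, y) in zip(projection_matrices, points_2d):
        # Each view contributes two linear constraints on the homogeneous 3D point.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.stack(rows)
    # Least-squares solution: the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize
```

The second sketch illustrates DTW-based scoring, assuming each frame is a flattened vector of 3D joint coordinates and that the per-frame cost is the Euclidean distance between two poses; the thesis may use a different per-frame cost or normalization.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """DTW distance between two pose sequences of possibly different lengths.

    seq_a, seq_b: arrays of shape (frames, joints * 3), one row per frame.
    """
    n, m = len(seq_a), len(seq_b)
    cost = np.full((n + 1, m + 1), np.inf)   # accumulated alignment cost
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])  # per-frame pose distance
            # Extend the cheapest of the three allowed alignments.
            cost[i, j] = d + min(cost[i - 1, j],       # repeat a frame of seq_b
                                 cost[i, j - 1],       # repeat a frame of seq_a
                                 cost[i - 1, j - 1])   # advance both sequences
    return cost[n, m]

# Hypothetical usage: a reference motion and a learner's motion with different frame counts.
reference = np.random.rand(120, 25 * 3)   # 120 frames, 25 joints (OpenPose BODY_25 format)
learner = np.random.rand(90, 25 * 3)
score = dtw_distance(reference, learner)
```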

Abstract  I
Acknowledgment  II
Table of Contents  III
List of Figures  V
List of Tables  VI
Chapter 1. Introduction  1
  1.1 Background  1
  1.2 Contribution  2
  1.3 Research Outline  2
Chapter 2. Related Works  3
  2.1 Human Pose Estimation  3
  2.2 Camera and Calibration  5
  2.3 Pose Comparison  5
  2.4 Kalman Filter  6
Chapter 3. Proposed System  8
  3.1 System Architecture  8
  3.2 2D Pose Estimation  9
  3.3 Camera Projection Matrix and the Calibration  10
  3.4 Noise Reduction  15
  3.5 Scoring Phase  17
Chapter 4. Experimental Results  20
  4.1 Experiment Parameters  20
  4.2 Experimental Results  21
Chapter 5. Conclusion and Discussion  41
  5.1 Conclusion  41
References  42

