
Graduate Student: 鄧少鈞 (Shao-Jun Deng)
Thesis Title: 基於卷積神經網路及人體骨架資訊之靜態與動態動作即時追蹤與辨識系統
A Real-time Tracking and Recognition System for Static and Dynamic Human Actions Based on a Convolutional Neural Network and Human Skeleton Information
Advisor: 施慶隆 (Ching-Long Shih)
Committee Members: 施慶隆 (Ching-Long Shih), 黃志良 (Chih-Lyang Hwang), 李文猶 (Wen-Yo Lee), 吳修明 (Hsiu-Ming Wu)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Publication Year: 2021
Graduation Academic Year: 109 (ROC era, i.e., 2020-2021)
Language: Chinese
Pages: 56
Chinese Keywords: 機器學習、卷積神經網路、人體骨架、動作辨識、目標追蹤、即時監控
English Keywords: Machine Learning, Convolutional Neural Network, Human Skeleton, Motion Recognition, Target Tracking, Real-Time Monitoring System
Usage: 249 views, 0 downloads
  • This thesis aims to realize a real-time human tracking and dynamic motion recognition system using human joint information and a convolutional neural network. To this end, the hardware consists of a Kinect camera, a DC motor, and an FPGA DE0-Nano development board. The system comprises three subsystems: a camera-based human tracking system, a joint data processing system, and a machine learning system on the recognition side. For human tracking, the Kinect camera tracks feature joints of the human upper body, and camera movement commands are computed from the corresponding distance and angle differences; a closed-loop PID controller then drives the DC motor so that the camera follows the target. On the machine learning side, a conventional convolutional neural network is trained on a pre-built dataset to produce a weight file for the recognition stage. For data processing, a data-stream approach replaces the usual image input: the feature joints are processed in real time and arranged into a matrix that is fed to the pre-trained network model to obtain the recognition result. Combined with dual threads and a system state machine, the system achieves real-time dynamic motion recognition out to the camera's maximum human-tracking distance.
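The closed-loop tracking step described above (joint offset in, motor command out) can be sketched as a discrete PID update. The gains, the sample time, and the `PID` class below are illustrative assumptions, not the thesis's actual controller, which drives the DC motor through the FPGA board.

```python
class PID:
    """Minimal discrete PID controller (illustrative sketch)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error):
        # error: angular offset between the camera center and the
        # tracked upper-body joint reported by the Kinect (degrees).
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)

# Assumed gains and a ~30 fps Kinect frame period.
pid = PID(kp=0.8, ki=0.1, kd=0.05, dt=0.033)
command = pid.update(5.0)  # yaw command for a 5-degree offset
```

Each frame, the offset of the tracked joint from the image center would be fed to `update`, and the returned command sent to the DC motor driver.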


    This thesis uses human skeleton information and a convolutional neural network (CNN) to realize a real-time human tracking and dynamic motion recognition system. The hardware is implemented with a Kinect camera, a DC motor, and an FPGA development board. The software consists of three subsystems: a camera-based human tracking system, a skeleton data processing system, and a machine learning system. The Kinect camera tracks the coordinates of the human upper body in order to calculate the distances and angles of the two arms, and a PID position controller adjusts the yaw angle of the camera to achieve human tracking. A CNN is trained on a pre-built dataset, and a weight file is generated for human motion recognition testing. Unlike the traditional image input, the CNN's data source is a data stream: the feature joint points are computed in real time, arranged in sequence, and put into matrix form before being fed to the network. By integrating a two-thread system and a state machine, real-time dynamic human motion recognition is achieved within a limited distance from the camera.
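The data-stream input scheme described above can be sketched as a fixed-length sliding window over per-frame joint features; once the window fills, it is presented to the network as a 2-D matrix instead of an image. The window length and feature count below are assumed values, not the ones used in the thesis.

```python
from collections import deque

WINDOW = 30      # frames kept per sample (assumed)
FEATURES = 14    # e.g. x/y of 7 upper-body joints (assumed)

buffer = deque(maxlen=WINDOW)  # oldest frame drops out automatically

def push_frame(joint_features):
    """Append one frame of joint features; return a WINDOW x FEATURES
    matrix once enough frames have accumulated, else None."""
    assert len(joint_features) == FEATURES
    buffer.append(list(joint_features))
    if len(buffer) == WINDOW:
        return [row[:] for row in buffer]  # matrix fed to the CNN
    return None

# Feed 30 synthetic frames; the final call yields a full 30 x 14 matrix.
for t in range(WINDOW):
    matrix = push_frame([float(t)] * FEATURES)
```

Because the deque is bounded, every subsequent frame yields a new matrix shifted by one frame, which is what allows continuous, real-time classification rather than per-clip processing.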

    Abstract (Chinese) I
    Abstract (English) II
    Contents III
    List of Figures V
    List of Tables VII
    Chapter 1 Introduction 1
      1.1 Motivation and Objectives 1
      1.2 Literature Review 1
      1.3 Thesis Outline 3
    Chapter 2 System Architecture 4
      2.1 System Architecture Overview 4
      2.2 System Hardware 5
      2.3 System Control Flow 6
      2.4 DC Motor Position Control 7
      2.5 Kinect v2 Overview 8
        2.5.1 Kinect Hardware Specifications 8
        2.5.2 Kinect SDK Development Environment 10
        2.5.3 Kinect Skeleton Tracking 10
      2.6 Communication Format 12
    Chapter 3 Human Skeleton Data Preprocessing and Dataset 13
      3.1 Recognition Flow 13
      3.2 Action and Data Definitions 14
        3.2.1 Action Category List 14
        3.2.2 Action Data Definition 15
      3.3 Training Data and Labeling 17
        3.3.1 One-Hot Encoding 18
        3.3.2 Feature Scaling 19
        3.3.3 Training Dataset 19
      3.4 Pointing Coordinate Compensation 20
      3.5 Real-Time Recognition System Implementation 21
    Chapter 4 Data-Stream-Based Convolutional Neural Network 23
      4.1 Artificial Neural Networks 23
      4.2 Convolutional Neural Networks 24
      4.3 CNN Architecture 25
        4.3.1 Convolutional Layer 26
        4.3.2 Activation Function 26
        4.3.3 Pooling Layer 27
        4.3.4 Fully Connected Layer 28
        4.3.5 Loss Function 29
        4.3.6 Training Optimization Methods 29
      4.4 Network Training Procedure 30
    Chapter 5 Experimental Results and Discussion 31
      5.1 Human Skeleton Rendering with the OpenGL Utility Toolkit 32
      5.2 Camera Human Tracking Experiment 33
      5.3 CNN Training Experiment 35
      5.4 Row-Vector Data Discussion 37
      5.5 Action Prediction Probability Experiment 38
    Chapter 6 Conclusions and Suggestions 42
      6.1 Conclusions 42
      6.2 Suggestions 43
    References 44

    [1] Kinect SDK for Windows, [Online]. Available: https://developer.microsoft.com/zh-tw/windows/kinect/
    [2] E. P. Ijjina and C. K. Mohan, "Human Action Recognition Based on Recognition of Linear Patterns in Action Bank Features Using Convolutional Neural Networks," 2014 13th International Conference on Machine Learning and Applications, Detroit, MI, USA, 2014, pp. 178-182, doi: 10.1109/ICMLA.2014.33.
    [3] B. Hu and J. Wang, "Deep Learning Based Hand Gesture Recognition and UAV Flight Controls," 2018 24th International Conference on Automation and Computing (ICAC), 2018, pp. 1-6, doi: 10.23919/IConAC.2018.8748953.
    [4] R. Rastgoo, K. Kiani and S. Escalera, "Sign Language Recognition: A Deep Survey," Expert Systems with Applications, vol. 164, 2021, Art. no. 113794, doi: 10.1016/j.eswa.2020.113794.
    [5] T. Liu, Y. Song, Y. Gu and A. Li, "Human Action Recognition Based on Depth Images from Microsoft Kinect," 2013 Fourth Global Congress on Intelligent Systems, Hong Kong, China, 2013, pp. 200-204, doi: 10.1109/GCIS.2013.38.
    [6] T. Z. Wint Cho, M. T. Win and A. Win, "Human Action Recognition System Based on Skeleton Data," 2018 IEEE International Conference on Agents (ICA), Singapore, 2018, pp. 93-98, doi: 10.1109/AGENTS.2018.8458495.
    [7] J. Yamato, J. Ohya and K. Ishii, "Recognizing Human Action in Time-sequential Images Using Hidden Markov Model," Proceedings 1992 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Champaign, IL, USA, 1992, pp. 379-385, doi: 10.1109/CVPR.1992.223161.
    [8] C. Schuldt, I. Laptev and B. Caputo, "Recognizing Human Actions: a Local SVM Approach," Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), Cambridge, UK, 2004, vol. 3, pp. 32-36, doi: 10.1109/ICPR.2004.1334462.
    [9] X. Wang, L. Gao, J. Song and H. Shen, "Beyond Frame-level CNN: Saliency-Aware 3-D CNN With LSTM for Video Action Recognition," in IEEE Signal Processing Letters, vol. 24, no. 4, pp. 510-514, April 2017, doi: 10.1109/LSP.2016.2611485.
    [10] S. Vantigodi and R. Venkatesh Babu, "Real-time Human Action Recognition From Motion Capture Data," 2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2013, pp. 1-4, doi: 10.1109/NCVPRIPG.2013.6776204.
    [11] S. Gattupalli, "Human Motion Analysis and Vision-Based Articulated Pose Estimation," 2015 International Conference on Healthcare Informatics, 2015, pp. 470-470, doi: 10.1109/ICHI.2015.78.
    [12] N. Chen, Y. Chang, H. Liu, L. Huang and H. Zhang, "Human Pose Recognition Based on Skeleton Fusion from Multiple Kinects," 2018 37th Chinese Control Conference (CCC), 2018, pp. 5228-5232, doi: 10.23919/ChiCC.2018.8483016.
    [13] X. Tong, P. Xu and X. Yan, "Research on Skeleton Animation Motion Data Based on Kinect," 2012 Fifth International Symposium on Computational Intelligence and Design, 2012, pp. 347-350, doi: 10.1109/ISCID.2012.238.
    [14] D. Xu, X. Xiao, X. Wang and J. Wang, "Human Action Recognition Based on Kinect and PSO-SVM by Representing 3D Skeletons as Points in Lie Group," 2016 International Conference on Audio, Language and Image Processing (ICALIP), 2016, pp. 568-573, doi: 10.1109/ICALIP.2016.7846646.
    [15] Y. Choubik and A. Mahmoudi, "Machine Learning for Real Time Poses Classification Using Kinect Skeleton Data," 2016 13th International Conference on Computer Graphics, Imaging and Visualization (CGiV), 2016, pp. 307-311, doi: 10.1109/CGiV.2016.66.
    [16] W. Kuo, C. Kuo, S. Sun, P. Chang, Y. Chen and W. Cheng, "Machine Learning-based Behavior Recognition System for a Basketball Player Using Multiple Kinect cameras," 2016 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), 2016, pp. 1-1, doi: 10.1109/ICMEW.2016.7574661.
    [17] S. Wei, Y. Song and Y. Zhang, "Human Skeleton Tree Recurrent Neural Networks with Joint Relative Motion Feature for Skeletons Based Action Recognition," 2017 IEEE International Conference on Image Processing (ICIP), 2017, pp. 91-95, doi: 10.1109/ICIP.2017.8296249.
    [18] Y. Qiao, Y. Zhao and X. Song, "Dynamic Texture Classification Based on Motion Statistical Feature Matrix," 2013 Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013, pp. 535-538, doi: 10.1109/IIH-MSP.2013.138.
    [19] N. Käse, M. Babaee and G. Rigoll, "Multi-view Human Activity Recognition using Motion Frequency," 2017 IEEE International Conference on Image Processing (ICIP), 2017, pp. 3963-3967, doi: 10.1109/ICIP.2017.8297026.
    [20] F. Monti and C. S. Regazzoni, "Human Action Recognition Using the Motion of Interest Points," 2010 IEEE International Conference on Image Processing, 2010, pp. 709-712, doi: 10.1109/ICIP.2010.5651011.
    [21] D. P. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” ICLR, 2015.
    [22] A. Krizhevsky, I. Sutskever and G.E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks", Adv. Neural Inf. Process. Syst., pp. 1-9, 2012.
    [23] J. Liu, G. Wang, L. Duan, K. Abdiyeva and A. C. Kot, "Skeleton-Based Human Action Recognition With Global Context-Aware Attention LSTM Networks," in IEEE Transactions on Image Processing, vol. 27, no. 4, pp. 1586-1599, April 2018, doi: 10.1109/TIP.2017.2785279.
    [24] R. Li, H. Fu, W. Lo, Z. Chi, Z. Song and D. Wen, "Skeleton-Based Action Recognition with Key-Segment Descriptor and Temporal Step Matrix Model," in IEEE Access, vol. 7, pp. 169782-169795, 2019, doi: 10.1109/ACCESS.2019.2954744.

    Full-text release date: 2024/07/06 (campus network)
    Full-text release date: 2024/07/06 (off-campus network)
    Full-text release date: 2026/07/06 (National Central Library: Taiwan NDLTD system)