
Graduate student: 阮德魁 (De-Kuei Juan)
Thesis title: 長短期記憶神經網路基於不同資料型態輸入之人體動作辨識
Long Short-Term Memory Network Recognizing Human Action Based on Different Data Types
Advisor: 施慶隆 (Ching-Long Shih)
Committee members: 施慶隆 (Ching-Long Shih), 王乃堅 (Nai-Jian Wang), 李文猶 (Wen-Yo Lee), 吳修明 (Hsiu-Ming Wu)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2022
Graduation academic year: 111
Language: Chinese
Number of pages: 72
Keywords: Machine Learning, Long Short-Term Memory, Human Action Recognition, Human Body Key Points (Skeleton), Data Type
Hits: 221; Downloads: 1
This thesis feeds different data types into neural networks of identical structure in order to compare their effect on human action recognition. To this end, the experiment first captures video with an ordinary RGB camera (no depth values) as the raw data, which after preprocessing is organized into datasets of four data types: a position dataset, an angle dataset, and two composite datasets combining position and angle data. A long short-term memory network is then designed and trained separately on each of the four datasets, and their training processes and final test results are compared. Five actions that may occur during stroke rehabilitation are selected as standard actions, and from each of them an incorrect action is derived that achieves the same goal through compensatory movement. A standard action and its compensatory counterpart share the same action goal but differ slightly in the posture used to achieve it, so they are treated as distinct actions; in total, ten actions serve as the recognition targets of the neural network.


    The aim of this thesis is to compare the effects of different data-type inputs on a neural network of identical structure when recognizing human actions in real time. First, the experiment uses an RGB camera without depth sensing to capture the source data, which is converted after preprocessing into datasets of four data types: position data in pixel coordinates, angle data, and two composite data types combining position and angle data. A long short-term memory network is then designed and trained on each of the four datasets in turn, so that the training processes and test results can be compared. Five actions that may occur during the rehabilitation of stroke patients are chosen as standard actions, and from each of them an incorrect action is derived that accomplishes the same goal through compensatory movement. A standard action and its compensatory counterpart share the same action goal but use different muscles to achieve it, so they are treated as distinct actions in this thesis. As a result, ten actions in total are the targets of neural network recognition.
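The LSTM cell at the core of the network (covered in Chapter 4.3.2) updates a cell state and hidden state through input, forget, and output gates at every time step. The following numpy sketch shows one forward step of a standard LSTM cell; the dimensions, random weights, and gate ordering are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One standard LSTM cell step.
    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) bias,
    rows ordered [input gate | forget gate | candidate | output gate]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate: how much new information enters
    f = sigmoid(z[H:2*H])        # forget gate: how much old cell state is kept
    g = np.tanh(z[2*H:3*H])      # candidate cell state
    o = sigmoid(z[3*H:4*H])      # output gate: how much cell state is exposed
    c = f * c_prev + i * g       # new cell state
    h = o * np.tanh(c)           # new hidden state
    return h, c

# Run a toy feature sequence (e.g. per-frame keypoint features) through the cell.
rng = np.random.default_rng(0)
D, H, T = 50, 8, 30              # feature size, hidden size, sequence length
W = rng.normal(0, 0.1, (4 * H, D))
U = rng.normal(0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(T):
    h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
print(h.shape)  # (8,)
```

In the full system, the hidden state after the last frame would be passed to a softmax classification layer over the ten action classes; a deep-learning framework's LSTM layer performs these same gate computations internally.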

    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Motivation and Purpose
      1.2 Literature Review
      1.3 Thesis Outline
    Chapter 2 System Architecture and Task Description
      2.1 System Architecture
      2.2 Hardware
      2.3 Development Environment
      2.4 Task Description
      2.5 Action Definitions
    Chapter 3 Data Collection and Preprocessing
      3.1 Collecting Videos of All Actions
      3.2 Video Recognition
      3.3 Key-Point Extraction
      3.4 Training Datasets
        3.4.1 Position Data
        3.4.2 Angle Data
        3.4.3 Composite Data
      3.5 Recognition System
    Chapter 4 Long Short-Term Memory Networks
      4.1 Artificial Neural Networks
      4.2 Recurrent Neural Networks
        4.2.1 Recurrent Neural Network Architecture
      4.3 Long Short-Term Memory Networks
        4.3.1 Long Short-Term Memory Network Architecture
        4.3.2 Long Short-Term Memory Network Cells
      4.4 Activation Functions
      4.5 Loss Function and Optimizer
    Chapter 5 Experimental Results and Discussion
      5.1 Neural Network Datasets
      5.2 Training and Validation Datasets
        5.2.1 Position Data
        5.2.2 Angle Data
        5.2.3 Upper-Body Position Combined with Lower-Body Angle Data
        5.2.4 Upper-Body Angle Combined with Lower-Body Position Data
        5.2.5 Combined Results
      5.3 Test Data
        5.3.1 Test Results and Analysis of the Position-Data Model
        5.3.2 Test Results and Analysis of the Angle-Data Model
        5.3.3 Test Results and Analysis of the Upper-Body Position / Lower-Body Angle Model
        5.3.4 Test Results and Analysis of the Upper-Body Angle / Lower-Body Position Model
        5.3.5 Results and Analysis Across the Four Data Types
    Chapter 6 Conclusions and Suggestions
      6.1 Conclusions
      6.2 Suggestions
    References

