
Graduate Student: Shiela Mecha Cabahug
Thesis Title: Badminton Strokes Classification Based on Human Pose Estimation and Machine Learning
Advisor: Yuan-Hsiang Lin
Committee Members: Yuan-Hsiang Lin, Shang-Jang Ruan, Chang-Hong Lin, Ching-Shun Lin
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Publication Year: 2022
Academic Year: 110
Language: English
Pages: 95
Keywords: action recognition, badminton, hit action detection, pose estimation, stroke classification, real-time
Hits: 310; Downloads: 2


    For coaches and players, identifying badminton strokes is crucial for evaluating technical skills and analyzing in-play performance. However, relying purely on human observation and judgment is a strenuous task. This study aims to train personal and general badminton stroke classification models based on human pose estimation and machine learning.
    In this study, a personal classification model is trained for each of 11 subjects, while the general model is trained and tested on data from 10 subjects. The data feature for both classification models consists of the 25 body-pose keypoints over 30 frames. Three machine learning algorithms, CNN, LSTM, and CNN-LSTM, are selected to train the classification models. The personal models use a 75:25 train-test split on 440 data samples, and the average accuracies of the three personal models with 11 classified actions are 94.32%, 89.88%, and 89.51%, respectively.
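    The feature layout and the 75:25 split described above can be sketched as follows. This is a minimal illustration with synthetic numpy data; the array names and random values are hypothetical stand-ins, not the study's actual dataset.

```python
import numpy as np

# One sample = 30 frames x 25 keypoints x (x, y), flattened per frame
# into a 50-dimensional vector, as described in the abstract.
N_SAMPLES, N_FRAMES, N_KEYPOINTS = 440, 30, 25

rng = np.random.default_rng(0)
poses = rng.random((N_SAMPLES, N_FRAMES, N_KEYPOINTS, 2))  # stand-in pose data
labels = rng.integers(0, 11, size=N_SAMPLES)               # 11 stroke classes

features = poses.reshape(N_SAMPLES, N_FRAMES, N_KEYPOINTS * 2)  # (440, 30, 50)

# 75:25 train-test split over shuffled sample indices
split = int(0.75 * N_SAMPLES)
idx = rng.permutation(N_SAMPLES)
train_idx, test_idx = idx[:split], idx[split:]
X_train, X_test = features[train_idx], features[test_idx]
y_train, y_test = labels[train_idx], labels[test_idx]

print(X_train.shape, X_test.shape)  # (330, 30, 50) (110, 30, 50)
```

    Each (30, 50) sample is the kind of sequence input that a Conv1D or LSTM layer can consume directly.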
    On the other hand, the general model classifies 11 and 9 actions using 7 subjects' data for training and 3 subjects' data for testing. The average accuracies of the CNN, LSTM, and CNN-LSTM algorithms on the general model with 11 actions are 83%, 82%, and 74%, respectively, while on the general model with 9 actions they are 90%, 88%, and 87%, respectively.
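    The subject-wise partition behind the general model (train on 7 subjects, test on 3 held-out subjects) can be sketched like this; the sample records and per-subject clip count are made up for illustration:

```python
# Hypothetical sample records tagged with subject ids 0-9; subjects 0-6
# form the training set and subjects 7-9 the test set, mirroring the
# 7-train / 3-test subject partition in the abstract.
samples = [{"subject": s, "clip": c} for s in range(10) for c in range(4)]

train_subjects = set(range(7))
train = [x for x in samples if x["subject"] in train_subjects]
test = [x for x in samples if x["subject"] not in train_subjects]

print(len(train), len(test))  # 28 12
```

    Splitting by subject rather than by sample ensures that no test subject's data leaks into training, which is why general-model accuracy is lower than the personal models'.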
    The trained CNN general models, with 83% and 90% accuracy for 11 and 9 classified actions respectively, are used to predict the real-time actions of the 3 test subjects. Real-time data are extracted by the real-time hit action detection algorithm, which achieves 89.27% and 92.11% hit-detection accuracy for 11 and 9 classified actions. The real-time average accuracy of the trained classification model is 72.67% for 11 actions and 75.00% for 9 actions.
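    A minimal sketch of how a 30-frame window might be paired with detected hits in a real-time stream. `stream_windows` and the fixed hit indices are hypothetical stand-ins for the thesis's hit action detection algorithm and sliding window, not the actual implementation.

```python
from collections import deque

WINDOW = 30  # frames per classification sample, matching the abstract

def stream_windows(frames, hit_frames):
    """Yield a 30-frame window each time a hit is detected.

    `frames` is an iterable of per-frame pose vectors; `hit_frames` is the
    set of frame indices where the (assumed) hit detector fires.
    """
    buf = deque(maxlen=WINDOW)  # sliding window over the most recent frames
    for i, frame in enumerate(frames):
        buf.append(frame)
        if i in hit_frames and len(buf) == WINDOW:
            yield i, list(buf)  # window ending at the detected hit

windows = list(stream_windows(range(100), hit_frames={40, 75}))
print([i for i, _ in windows])  # [40, 75]
```

    Each yielded window would then be reshaped and fed to the trained classifier for real-time stroke prediction.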

    Acknowledgment
    Abstract (Chinese)
    Abstract (English)
    List of Figures
    List of Tables
    1. Introduction
       1.1. Motivation and Objectives
       1.2. Review of Related Works
       1.3. Thesis Overview
    2. Methods
       2.1. System Architecture
       2.2. Steps in Machine Learning
       2.3. Dataset Preparation
          2.3.1. Data Collection
          2.3.2. Data Labeling
          2.3.3. Data Feature Extraction
          2.3.4. Hit Action Localization
          2.3.5. Hit Action Segmentation
          2.3.6. Data Feature Selection
       2.4. Badminton Action Classification Model
          2.4.1. Convolutional Neural Network (CNN) Architecture
          2.4.2. Long Short-Term Memory (LSTM) Architecture
          2.4.3. Combined CNN and LSTM Architecture
          2.4.4. Model Evaluation Methods
          2.4.5. Model Validation Techniques
       2.5. Real-Time Badminton Action Recognition System
          2.5.1. Badminton Court Extraction
          2.5.2. OpenPose Thread
          2.5.3. Hit Action Detection Algorithm
          2.5.4. Sliding Window
          2.5.5. Action Prediction
          2.5.6. Player Location
          2.5.7. WebSocket Server
          2.5.8. UI Application
    3. Results
       3.1. Personal Model
       3.2. General Model
          3.2.1. Offline Dataset
          3.2.2. Real-Time Dataset
             3.2.2.1. Hit Action Detection Rate
             3.2.2.2. Badminton Action Recognition Accuracy
    4. Discussion
       4.1. Hit Action Detection Algorithm
       4.2. Badminton Action Classification Model
       4.3. Badminton Actions Performance
       4.4. NBRP Analysis
       4.5. Comparison with other Related Works
    5. Conclusion and Future Works
    References

