
Graduate Student: Bor-Yao Tseng (曾柏堯)
Thesis Title: Automated Baseball Pitching Video Generation System and Application Using Deep Learning Model
Advisor: Jiann-Liang Chen (陳俊良)
Committee Members: Jiann-Liang Chen (陳俊良), Nen-Fu Huang (黃能富), Han-Chieh Chao (趙涵捷), Sy-Yen Kuo (郭斯彥)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of Publication: 2024
Graduation Academic Year: 112 (ROC calendar)
Language: English
Number of Pages: 82
Keywords (Chinese): 棒球賽事, 機器學習, 影像分析, 投球影像數據, 投球辨識
Keywords (English): Baseball Game, Machine Learning, Image Analysis, Pitching Image Data, Pitching Recognition
Access Counts: 179 views, 7 downloads



In recent years, with the increasing global attention and investment in popular sports domains, related sports analysis research has also flourished. These studies, utilizing advanced data analysis techniques, have unveiled key factors influencing athlete performance and have deeply explored sports strategies and training methods, thereby providing robust support for the development of these sports. For example, data analysis has become a critical basis for team decision-making in modern sports such as baseball. By analyzing batting averages, pitching speeds, and swing trajectories, coaches and data analysts can accurately assess players' conditions and potential. This not only aids in formulating more effective training plans but also enables the development of targeted game strategies based on opponents' weaknesses. Consequently, coaching teams can make real-time tactical adjustments during games, profoundly impacting the overall strategy and outcomes of the team.
However, processing sports data necessitates substantial human resources, which has become a significant focus in modern sports. With advancements in data technology, the volume of sports data continues to increase. For instance, analyzing pitching footage from baseball games requires extracting an average of 300 pitching images from a game that typically lasts 2 hours and 40 minutes. The data collection process in this context demands considerable human labor.
To address the labor-intensive process of collecting pitching image data, this study proposes a novel pitching-recognition method based on baseball broadcast footage. The method accepts baseball game footage of any length as input; the system automatically analyzes its content and generates a 3-second pitching video clip for every pitching scene in the input footage.
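
As an illustration of the clip-generation step just described (this is not code from the thesis), the sketch below cuts a fixed 3-second window starting at each detected pitch frame using OpenCV; the video path, frame indices, helper name, and output naming are all assumptions made for the example.

    # Hypothetical sketch: write a 3-second clip starting at each detected pitch frame.
    import cv2

    def extract_clips(video_path, pitch_start_frames, clip_seconds=3.0, out_prefix="pitch"):
        cap = cv2.VideoCapture(video_path)
        fps = cap.get(cv2.CAP_PROP_FPS)
        size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
                int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
        frames_per_clip = int(round(fps * clip_seconds))
        fourcc = cv2.VideoWriter_fourcc(*"mp4v")
        for i, start in enumerate(pitch_start_frames):
            cap.set(cv2.CAP_PROP_POS_FRAMES, start)   # jump to the detected pitch frame
            writer = cv2.VideoWriter(f"{out_prefix}_{i:04d}.mp4", fourcc, fps, size)
            for _ in range(frames_per_clip):
                ok, frame = cap.read()
                if not ok:
                    break
                writer.write(frame)
            writer.release()
        cap.release()

    # Example call with made-up pitch start frames in a 30 fps broadcast:
    # extract_clips("game.mp4", [1200, 5400, 9100])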
This study employs an AI image recognition model to perform an initial analysis of the input baseball game footage, extracting baseball coordinates and player posture information as features. An algorithm designed in this study computes the ball's trajectory direction and the players' on-field positions, enabling automated identification of pitching timestamps within the input footage and generation of the corresponding pitching video dataset. For testing, this research collected a total of 6 hours and 57 minutes of MLB and KBO game footage, achieving a precision of 96.39% and a recall of 94.55%.
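
The abstract does not spell out the decision rules, so the following is only a schematic sketch of how per-frame ball coordinates and pose keypoints might be combined into a "ball travels from the pitcher toward the catcher" test; every field name and threshold here is an assumption, not the thesis's algorithm. Precision and recall are understood in the usual sense (precision = TP / (TP + FP), recall = TP / (TP + FN)), presumably with a true positive being a correctly identified pitching clip.

    # Schematic sketch only: combine a short ball track and player keypoints into a pitch test.
    from dataclasses import dataclass
    from typing import List, Optional, Tuple

    Point = Tuple[float, float]

    @dataclass
    class FrameFeatures:
        ball_xy: Optional[Point]      # detected ball centre in this frame, if any
        pitcher_xy: Optional[Point]   # a keypoint of the presumed pitcher (e.g. hip)
        catcher_xy: Optional[Point]   # a keypoint of the presumed catcher

    def looks_like_pitch(window: List[FrameFeatures], min_track: int = 5) -> bool:
        """Rough heuristic: the ball is tracked across several consecutive frames
        and moves from the pitcher's side of the frame toward the catcher's side."""
        if not window:
            return False
        track = [f.ball_xy for f in window if f.ball_xy is not None]
        anchor = window[0]
        if len(track) < min_track or anchor.pitcher_xy is None or anchor.catcher_xy is None:
            return False
        ball_dx = track[-1][0] - track[0][0]                       # observed horizontal travel
        expected_dx = anchor.catcher_xy[0] - anchor.pitcher_xy[0]  # pitcher-to-catcher direction
        return ball_dx * expected_dx > 0

Per the table of contents below, the thesis further splits this check into identifying the pitcher, then the catcher and batter, before a pitching clip is generated.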

Abstract (Chinese)
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Organization
Chapter 2 Related Work
  2.1 Baseball Sport Data Analysis
  2.2 AI Image Recognition
  2.3 AI Human Pose Estimation
  2.4 AI Applications in Baseball
  2.5 Baseball Image Activity Recognition
Chapter 3 Proposed System
  3.1 System Architecture
  3.2 Data Collection
    3.2.1 MLB-YouTube Dataset
    3.2.2 KBO YouTube
  3.3 Data Processing
  3.4 Detection Model
    3.4.1 Baseball Detection Model
    3.4.2 Baseball Detection Model
  3.5 Video Generation
    3.5.1 Ball Trajectory Direction Analysis
    3.5.2 Pitching Camera Analysis
      3.5.2.1 Identify Pitcher
      3.5.2.2 Identify Catcher and Batter
      3.5.2.3 Pitching Video Generation
Chapter 4 Performance Analysis
  4.1 System Environment
    4.1.1 Experiment Environment
    4.1.2 Experiment Parameter
  4.2 Performance Evaluation Metrics
  4.3 Performance Analysis
    4.3.1 System Performance
    4.3.2 Failed Sample Analysis
  4.4 Comparison to Other Studies
Chapter 5 Conclusions and Future Works
  5.1 Conclusions
  5.2 Future Works
References

