
Graduate Student: Bor-Yao Tseng (曾柏堯)
Thesis Title: Automated Baseball Pitching Video Generation System and Application Using Deep Learning Model
Advisor: Jiann-Liang Chen (陳俊良)
Committee Members: Jiann-Liang Chen (陳俊良), Nen-Fu Huang (黃能富), Han-Chieh Chao (趙涵捷), Sy-Yen Kuo (郭斯彥)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of Publication: 2024
Graduation Academic Year: 112 (ROC calendar)
Language: English
Number of Pages: 82
Keywords (Chinese): 棒球賽事, 機器學習, 影像分析, 投球影像數據, 投球辨識
Keywords (English): Baseball Game, Machine Learning, Image Analysis, Pitching Image Data, Pitching Recognition
Access Counts: 179 views, 7 downloads



In recent years, with the increasing global attention and investment in popular sports domains, related sports analysis research has also flourished. These studies, utilizing advanced data analysis techniques, have unveiled key factors influencing athlete performance and have deeply explored sports strategies and training methods, thereby providing robust support for the development of these sports. For example, data analysis has become a critical basis for team decision-making in modern sports such as baseball. By analyzing batting averages, pitching speeds, and swing trajectories, coaches and data analysts can accurately assess players' conditions and potential. This not only aids in formulating more effective training plans but also enables the development of targeted game strategies based on opponents' weaknesses. Consequently, coaching teams can make real-time tactical adjustments during games, profoundly impacting the overall strategy and outcomes of the team.
However, processing sports data necessitates substantial human resources, which has become a significant focus in modern sports. With advancements in data technology, the volume of sports data continues to increase. For instance, analyzing pitching footage from baseball games requires extracting an average of 300 pitching images from a game that typically lasts 2 hours and 40 minutes. The data collection process in this context demands considerable human labor.
To address the labor-intensive process of collecting pitching image data, this study proposes a novel pitching-recognition method based on baseball broadcast footage. The method accepts baseball game footage of any length as input; the system automatically analyzes its content and generates a 3-second pitching video clip for every pitching scene in the input footage.
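
As an illustration of the clip-generation step just described (this is not code from the thesis), the sketch below cuts a fixed 3-second window starting at each detected pitch frame using OpenCV; the video path, frame indices, helper name, and output naming are all assumptions made for the example.

    # Hypothetical sketch: write a 3-second clip starting at each detected pitch frame.
    import cv2

    def extract_clips(video_path, pitch_start_frames, clip_seconds=3.0, out_prefix="pitch"):
        cap = cv2.VideoCapture(video_path)
        fps = cap.get(cv2.CAP_PROP_FPS)
        size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
                int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
        frames_per_clip = int(round(fps * clip_seconds))
        fourcc = cv2.VideoWriter_fourcc(*"mp4v")
        for i, start in enumerate(pitch_start_frames):
            cap.set(cv2.CAP_PROP_POS_FRAMES, start)   # jump to the detected pitch frame
            writer = cv2.VideoWriter(f"{out_prefix}_{i:04d}.mp4", fourcc, fps, size)
            for _ in range(frames_per_clip):
                ok, frame = cap.read()
                if not ok:
                    break
                writer.write(frame)
            writer.release()
        cap.release()

    # Example call with made-up pitch start frames in a 30 fps broadcast:
    # extract_clips("game.mp4", [1200, 5400, 9100])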
This study employs an AI image recognition model to perform an initial analysis of the input baseball game footage, extracting baseball coordinates and player posture information as features. An algorithm designed in this study computes the ball's trajectory direction and the players' on-field positions, enabling automated identification of pitching timestamps within the input footage and generation of the corresponding pitching video dataset. For testing, this research collected a total of 6 hours and 57 minutes of MLB and KBO game footage, achieving a precision of 96.39% and a recall of 94.55%.
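
The abstract does not spell out the decision rules, so the following is only a schematic sketch of how per-frame ball coordinates and pose keypoints might be combined into a "ball travels from the pitcher toward the catcher" test; every field name and threshold here is an assumption, not the thesis's algorithm. Precision and recall are understood in the usual sense (precision = TP / (TP + FP), recall = TP / (TP + FN)), presumably with a true positive being a correctly identified pitching clip.

    # Schematic sketch only: combine a short ball track and player keypoints into a pitch test.
    from dataclasses import dataclass
    from typing import List, Optional, Tuple

    Point = Tuple[float, float]

    @dataclass
    class FrameFeatures:
        ball_xy: Optional[Point]      # detected ball centre in this frame, if any
        pitcher_xy: Optional[Point]   # a keypoint of the presumed pitcher (e.g. hip)
        catcher_xy: Optional[Point]   # a keypoint of the presumed catcher

    def looks_like_pitch(window: List[FrameFeatures], min_track: int = 5) -> bool:
        """Rough heuristic: the ball is tracked across several consecutive frames
        and moves from the pitcher's side of the frame toward the catcher's side."""
        if not window:
            return False
        track = [f.ball_xy for f in window if f.ball_xy is not None]
        anchor = window[0]
        if len(track) < min_track or anchor.pitcher_xy is None or anchor.catcher_xy is None:
            return False
        ball_dx = track[-1][0] - track[0][0]                       # observed horizontal travel
        expected_dx = anchor.catcher_xy[0] - anchor.pitcher_xy[0]  # pitcher-to-catcher direction
        return ball_dx * expected_dx > 0

Per the table of contents below, the thesis further splits this check into identifying the pitcher, then the catcher and batter, before a pitching clip is generated.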

Abstract (Chinese)
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Organization
Chapter 2 Related Work
  2.1 Baseball Sport Data Analysis
  2.2 AI Image Recognition
  2.3 AI Human Pose Estimation
  2.4 AI Applications in Baseball
  2.5 Baseball Image Activity Recognition
Chapter 3 Proposed System
  3.1 System Architecture
  3.2 Data Collection
    3.2.1 MLB-YouTube Dataset
    3.2.2 KBO YouTube
  3.3 Data Processing
  3.4 Detection Model
    3.4.1 Baseball Detection Model
    3.4.2 Baseball Detection Model
  3.5 Video Generation
    3.5.1 Ball Trajectory Direction Analysis
    3.5.2 Pitching Camera Analysis
      3.5.2.1 Identify Pitcher
      3.5.2.2 Identify Catcher and Batter
      3.5.2.3 Pitching Video Generation
Chapter 4 Performance Analysis
  4.1 System Environment
    4.1.1 Experiment Environment
    4.1.2 Experiment Parameter
  4.2 Performance Evaluation Metrics
  4.3 Performance Analysis
    4.3.1 System Performance
    4.3.2 Failed Sample Analysis
  4.4 Comparison to Other Studies
Chapter 5 Conclusions and Future Works
  5.1 Conclusions
  5.2 Future Works
References

