
Graduate Student: 鄧仲恩 (Zhong-En Deng)
Thesis Title: 運用多元深度學習模型進行角子機遊戲資訊提取與互動性辨識系統建置
(Slot Machine Game Information Extraction and Interactive Recognition System Using Multiple Deep Learning Models)
Advisor: 戴文凱 (Wen-Kai Tai)
Oral Defense Committee: 金台齡 (Tai-Lin Chin), 紀明德 (Ming-Te Chi)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of Publication: 2023
Academic Year of Graduation: 111 (2022–2023)
Language: Chinese
Pages: 103
Keywords: Slot Game, Game Text Recognition, Game Mode Recognition, Game State Recognition
Slot machines were originally fully mechanical; electronic slot machines developed later and attracted even more players. In recent years, the large number of online slot games has created business opportunities for game companies, but developing a unique and appealing game is challenging. A practical approach is to study successful slot games and analyze their prize settings. Obtaining those settings, however, requires extensive play and manual transcription of game information, which is time-consuming and labor-intensive. This thesis therefore automates information extraction through image processing, reducing cost and improving accuracy, and making it feasible to analyze the prize settings of many online slot games.

This thesis recognizes five kinds of basic information: game text, game mode, reel spin/stop state, symbol positions, and symbol types. It also proposes a system with a user interface for adjusting and integrating the individual methods so that different slot games can be recognized. PARSeq, ABINet, and TRBA are used for game text recognition. Image comparison recognizes the game mode with 100% accuracy. RAFT and a genetic algorithm recognize the reel spin/stop state with a recall above 98.4%. YOLOv7, combined with an automatically generated training set, recognizes symbol positions with mAP@0.5 above 99.4%. ResNet50, VGG16, ViT, CLIP, and MAE extract image features that are then clustered to recognize symbol types, reaching above 99.8% accuracy without any training. After integrating these methods, the overall game recognition accuracy exceeds 97.9%.
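The game-mode step relies only on comparing the current frame with reference screenshots. The abstract does not spell out the exact comparison procedure, so the following is a minimal sketch assuming normalized cross-correlation against one reference image per mode; the mode names, file names, and threshold are hypothetical placeholders, not the thesis's actual configuration.

```python
import cv2
import numpy as np

# Hypothetical reference screenshots, one per game mode.
REFERENCES = {
    "base_game": cv2.imread("ref_base_game.png"),
    "free_spins": cv2.imread("ref_free_spins.png"),
}

def recognize_game_mode(frame: np.ndarray, threshold: float = 0.90) -> str | None:
    """Return the mode whose reference best matches the frame, or None."""
    best_mode, best_score = None, threshold
    for mode, ref in REFERENCES.items():
        # Compare at the reference resolution; equal-size inputs yield a
        # 1x1 result array holding the single correlation score.
        resized = cv2.resize(frame, (ref.shape[1], ref.shape[0]))
        score = float(cv2.matchTemplate(resized, ref, cv2.TM_CCOEFF_NORMED)[0, 0])
        if score > best_score:
            best_mode, best_score = mode, score
    return best_mode
```

Because the UI chrome of a slot game is pixel-stable within a mode, even a direct comparison of this kind can separate modes reliably, which is consistent with the 100% accuracy reported above.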


In recent years, a large number of online slot machine games have brought business opportunities to gaming companies. However, developing unique and appealing games is quite challenging, so it is valuable to study and imitate the prize settings of successful slot machine games. Obtaining these settings requires many spins and manual transcription, which is time-consuming and labor-intensive. Hence, we aim to automate the extraction of prize-setting information through image processing to reduce costs and improve accuracy.
This thesis focuses on identifying five kinds of essential information: game text, game mode, reel state, symbol positions, and symbol types. We propose a system with a friendly user interface to adjust and integrate various methods for recognizing different slot machine games. PARSeq, ABINet, and TRBA are employed for game text recognition. We achieve 100% accuracy in game mode recognition through image comparison, and a recall rate of over 98.4% in reel state recognition using RAFT and genetic algorithms. We achieve over 99.4% mAP@0.5 in symbol position recognition by combining YOLOv7 with automatic training set generation. To identify symbol types, we extract and cluster image features using ResNet50, VGG16, ViT, CLIP, and MAE, achieving over 99.8% accuracy without any training. Finally, by integrating the proposed methods, the overall accuracy of game setting recognition reaches 97.9%.
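The symbol-type step is the only training-free component: a pretrained backbone embeds cropped symbol images, and clustering groups crops of the same type. The sketch below illustrates this idea with one of the five named backbones (torchvision's ResNet50) plus k-means; the choice of k-means, the image paths, and the assumption that the number of symbol types is known in advance are illustrative assumptions, since the abstract does not name a clustering algorithm.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from sklearn.cluster import KMeans

# Pretrained ResNet50 with the classifier removed -> 2048-d embeddings.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def embed(paths: list[str]) -> torch.Tensor:
    """Embed cropped symbol images with the frozen backbone."""
    batch = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths])
    return backbone(batch)

def cluster_symbols(paths: list[str], n_types: int) -> list[int]:
    """Group symbol crops into n_types clusters; same id = same symbol type."""
    features = embed(paths).numpy()
    return KMeans(n_clusters=n_types, n_init=10).fit_predict(features).tolist()
```

Any of the other named backbones (VGG16, ViT, CLIP, MAE) could be swapped in as the embedding function; only the `embed` step changes, the clustering step stays the same.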

Table of Contents

Recommendation Form
Approval Form
Chinese Abstract
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
List of Algorithms
1 Introduction
  1.1 Research Background and Motivation
  1.2 Research Objectives
  1.3 Overview of Research Methods
  1.4 Research Contributions
  1.5 Thesis Organization
2 Related Work
  2.1 Game Information Extraction
  2.2 Scene Text Recognition
    2.2.1 Convolutional Recurrent Neural Network (CRNN)
    2.2.2 TPS-ResNet-BiLSTM-Attn (TRBA)
    2.2.3 Permuted Autoregressive Sequence (PARSeq)
  2.3 You Only Look Once v7 (YOLOv7)
  2.4 Optical Flow Estimation
  2.5 Image Feature Extraction
    2.5.1 Convolutional Neural Network (CNN)
    2.5.2 Vision Transformer (ViT)
    2.5.3 Masked Autoencoder (MAE)
    2.5.4 Contrastive Language-Image Pre-training (CLIP)
3 Research Methods
  3.1 Text Recognition
    3.1.1 Text Recognition Methods
    3.1.2 Test Dataset Generation
  3.2 Game Mode Recognition
  3.3 Reel Spin/Stop State Recognition
    3.3.1 Optical Flow Estimation
    3.3.2 Reel Spin/Stop State Recognition
    3.3.3 Genetic Algorithm
  3.4 Symbol Detection
    3.4.1 Symbol Detection Test Set
    3.4.2 Symbol Detection Training Set
    3.4.3 Symbol Detection Training Method
  3.5 Symbol Recognition
    3.5.1 Symbol Recognition Methods
    3.5.2 Symbol Recognition Test Set
  3.6 Integrated Recognition
    3.6.1 General Recognition Method
    3.6.2 Game-Specific Recognition Method
  3.7 Recognition System
    3.7.1 Recognition Pipeline Data Structures
    3.7.2 Recognition System Execution
    3.7.3 Recognition System User Interface
4 Experimental Results and Analysis
  4.1 Experimental Environment
  4.2 Text Recognition
  4.3 Game Mode Recognition
  4.4 Reel Spin/Stop State Recognition
  4.5 Symbol Recognition
  4.6 Symbol Detection
  4.7 Integrated Recognition
  4.8 Recognition System
5 Conclusion and Future Work
  5.1 Conclusion
  5.2 Future Work
References
Appendix 1: Reserved Parameters in the Parameter Bank
  6.1 Required Parameters for the Video Recognition System
  6.2 Required Reserved Parameters for Slot Machines
Appendix 2: How to Supply Input Parameters to the Recognition Pipeline
Appendix 3: Parameter Settings for Generating the Symbol Detection Training Dataset

[1] 陳品劭, “程序化遊戲設計：基於差分進化演算法之老虎機自動化遊戲體驗生成研究 [Procedural game design: Automated slot machine game experience generation based on a differential evolution algorithm],” 2022.
[2] B. Shi, X. Bai, and C. Yao, “An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 11, pp. 2298–2304, 2016.
    [3] J. Baek, G. Kim, J. Lee, S. Park, D. Han, S. Yun, S. J. Oh, and H. Lee, “What is wrong with scene text recognition model comparisons? dataset and model analysis,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 4715–4723.
    [4] D. Bautista and R. Atienza, “Scene text recognition with permuted autoregressive sequence models,” arXiv preprint arXiv:2207.06966, 2022.
[5] Z. Teed and J. Deng, “RAFT: Recurrent all-pairs field transforms for optical flow,” in Proceedings of the European Conference on Computer Vision (ECCV). Springer, 2020, pp. 402–419.
    [6] M. Wang and W. Deng, “Deep face recognition: A survey,” Neurocomputing, vol. 429, pp. 215–244, 2021.
    [7] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
    [8] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
    [9] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.
[10] K. He, X. Chen, S. Xie, Y. Li, P. Dollár, and R. Girshick, “Masked autoencoders are scalable vision learners,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 16000–16009.
    [11] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark et al., “Learning transferable visual models from natural language supervision,” in International Conference on Machine Learning (ICML). PMLR, 2021, pp. 8748–8763.
    [12] G. Wölflein and O. Arandjelović, “Determining chess game state from an image,” Journal of Imaging, vol. 7, no. 6, 2021.
    [13] H. Pidaparthy, M. H. Dowling, and J. H. Elder, “Automatic play segmentation of hockey videos,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 4585–4593.
    [14] C. Trivedi, K. Makantasis, A. Liapis, and G. N. Yannakakis, “Learning task- independent game state representations from unlabeled images,” in IEEE Conference on Games (CoG). IEEE, 2022, pp. 88–95.
    [15] C. Ringer and M. A. Nicolaou, “Deep unsupervised multi-view detection of video game stream highlights,” in Proceedings of the 13th International Conference on the Foundations of Digital Games (FDG), 2018, pp. 1–6.
[16] W. Zhu, J. Lou, L. Chen, Q. Xia, and M. Ren, “Detection examples of the proposed method on the ICDAR 2013 dataset [17],” Aug. 2017. [Online]. Available: https://plos.figshare.com/articles/figure/Detection_examples_of_the_proposed_method_on_the_ICDAR_2013_dataset_17_/5325856
    [17] B. Shi, X. Wang, P. Lyu, C. Yao, and X. Bai, “Robust scene text recognition with automatic rectification,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 4168–4176.
[18] 萌娘百科 (Moegirlpedia), “漢字的序順並不定一能影閱響讀 [The order of Chinese characters does not always affect reading; title deliberately scrambled],” 2020. [Online]. Available: https://mzh.moegirl.org.cn/index.php?title=%E6%B1%89%E5%AD%97%E7%9A%84%E9%A1%BA%E5%BA%8F%E5%B9%B6%E4%B8%8D%E4%B8%80%E5%AE%9A%E8%83%BD%E5%BD%B1%E5%93%8D%E9%98%85%E8%AF%BB
[19] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, “YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 7464–7475.
    [20] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, “Feature pyramid networks for object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2117–2125.
[21] S. Liu, L. Qi, H. Qin, J. Shi, and J. Jia, “Path aggregation network for instance segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 8759–8768.
[22] C.-Y. Wang, I.-H. Yeh, and H.-Y. M. Liao, “You only learn one representation: Unified network for multiple tasks,” arXiv preprint arXiv:2105.04206, 2021.
    [23] B. K. Horn and B. G. Schunck, “Determining optical flow,” Artificial Intelligence, vol. 17, no. 1-3, pp. 185–203, 1981.
[24] J.-Y. Bouguet et al., “Pyramidal implementation of the affine Lucas-Kanade feature tracker: Description of the algorithm,” Intel Corporation, vol. 5, no. 1-10, p. 4, 2001.
    [25] G. Farnebäck, “Polynomial expansion for orientation and motion estimation,” Ph.D. dissertation, Linköping University Electronic Press, 2002.
[26] F. Steinbrücker, T. Pock, and D. Cremers, “Large displacement optical flow computation without warping,” in IEEE 12th International Conference on Computer Vision (ICCV). IEEE, 2009, pp. 1609–1614.
    [27] E. Ilg, N. Mayer, T. Saikia, M. Keuper, A. Dosovitskiy, and T. Brox, “Flownet 2.0: Evolution of optical flow estimation with deep networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2462–2470.
[28] T.-W. Hui, X. Tang, and C. C. Loy, “LiteFlowNet: A lightweight convolutional neural network for optical flow estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 8981–8989.
[29] D. Sun, X. Yang, M.-Y. Liu, and J. Kautz, “PWC-Net: CNNs for optical flow using pyramid, warping, and cost volume,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 8934–8943.
[30] Y. Li, H. Mao, R. Girshick, and K. He, “Exploring plain vision transformer backbones for object detection,” in Proceedings of the European Conference on Computer Vision (ECCV). Springer, 2022, pp. 280–296.
[31] S. Fang, H. Xie, Y. Wang, Z. Mao, and Y. Zhang, “Read like humans: Autonomous, bidirectional and iterative language modeling for scene text recognition,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 7098–7107.
    [32] R. Atienza, “Vision transformer for fast and efficient scene text recognition,” in Pro- ceedings of the International Conference on Document Analysis and Recognition (ICDAR). Springer, 2021, pp. 319–334.
[33] V. I. Levenshtein, “Binary codes capable of correcting deletions, insertions, and reversals,” in Soviet Physics Doklady, vol. 10, no. 8. Soviet Union, 1966, pp. 707–710.

Full-text release date: 2026/08/02 (campus network)
Full-text release date: 2026/08/02 (off-campus network)
Full-text release date: 2026/08/02 (National Central Library: Taiwan NDLTD system)