Author: | Pei-Ru Wang (王珮如) |
---|---|
Thesis title: | A Study on AI Bot for Ludo Game Based on Reinforcement Learning Model (基於強化學習模型之 Ludo 遊戲 AI Bot 之研究) |
Advisor: | Wen-Kai Tai (戴文凱) |
Committee members: | Chin-Shyurng Fahn (范欽雄), Hsueh-Wu Wang (王學武) |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Publication year: | 2022 |
Graduation academic year: | 110 (ROC calendar) |
Language: | Chinese |
Pages: | 40 |
Keywords: | Reinforcement Learning, Ludo, game AI, Machine Learning, Comparison of reinforcement learning methods, Epsilon Greedy, Low Resources Needed |
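In the field of games, AI has always been a major research topic, progressing from early rule-based AI and reinforcement learning AI to today's deep reinforcement learning AI, which incorporates deep learning techniques. Traditional rule-based AI requires developers to be familiar with how the game is played and with a wide range of algorithms and data structures, and almost no two games can share the same architecture. Reinforcement learning AI, by contrast, can be used in most environments without major changes and does not have to be designed individually for each game. Deep reinforcement learning goes a step further by greatly reducing the computing power and memory that reinforcement learning requires, so that reinforcement learning AI combined with deep learning can be applied to more complex games. Although the core of reinforcement learning applies to almost any game, not every reinforcement learning method is suitable when training must finish in a short time without powerful computing resources. The main research goal of this thesis is therefore to find reinforcement learning methods that can complete training in a short time under limited computing power.
In this thesis we use the game Ludo to try a variety of reinforcement learning methods and parameter settings. We propose an exploration method that makes AI training more stable, as well as a stronger rule-based AI for Ludo that the reinforcement learning AI can learn from and be tested against. We restrict every experimental model to 48 hours of running time on hardware without a dedicated compute card. We try two reinforcement learning approaches, the State-Action Value Function and the State Value Function, apply different step settings to the Multi-Step Reinforcement Learning method, and evaluate the proposed exploration method in experiments on training time and win rate.
Comparing our parameter-tuned reinforcement learning AI with previous AIs for the same game shows that our AI can indeed complete training in a short time under hardware constraints, with a win rate 5-6% higher than that of the previous AIs.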
Game AI is an important part of game development. Traditional rule-based AI requires developers to design many rules and algorithms, whereas reinforcement-learning-based AI does not need to be redesigned for each game. Deep reinforcement learning combines reinforcement learning with deep learning, and it can address problems such as the Q-learning algorithm running out of memory.
Although deep reinforcement learning is very powerful, it takes a long time to train. Therefore, this thesis aims to find which models can be trained with limited time and computing resources.
In this thesis, we study an AI bot for the game Ludo based on a reinforcement learning model. We propose a reinforcement learning exploration method that makes AI training more stable. We also propose a rule-based AI that speeds up training and serves as an opponent for testing the strength of the trained AI. We compare the State-Action Value Function and the State Value Function under limited hardware and 48 hours of training time. In addition, we experiment with a Multi-Step Reinforcement Learning model under different step settings.
Comparing our reinforcement learning AI with previous AIs for Ludo, the experimental results show that our AI outperforms them by at least 5% in win rate.
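The abstract names Epsilon Greedy exploration and Multi-Step Reinforcement Learning without giving details. As a generic illustration only, the Python sketch below shows textbook epsilon-greedy action selection and an n-step return. It is not the exploration method proposed in the thesis, which the abstract does not describe, and the function names `epsilon_greedy` and `n_step_return` are hypothetical.

```python
import random


def epsilon_greedy(action_values, epsilon, rng=random):
    """Return an action index: random with probability epsilon, else greedy.

    `action_values` is a list of estimated values, one per legal action
    (hypothetical representation, not taken from the thesis).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(action_values))
    # Greedy choice: index of the highest estimated value.
    return max(range(len(action_values)), key=lambda a: action_values[a])


def n_step_return(rewards, bootstrap_value, gamma):
    """Discounted sum of n observed rewards plus gamma**n * bootstrap_value.

    `bootstrap_value` is the value estimate of the state reached after
    the n steps; the "step setting" corresponds to len(rewards).
    """
    g = bootstrap_value
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

With `epsilon=0.0` the selection is purely greedy, and with `epsilon=1.0` it is purely random. Longer reward lists in `n_step_return` rely more on observed rewards and less on the bootstrapped estimate.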