
Student: 董奎谷 (Kuei-Gu Tung)
Thesis Title: 基於倒推法與機器學習之人性化撞球 AI 研發
(A Study on Human-like Billiard AI Bot Based on Backward Induction and Machine Learning)
Advisor: 戴文凱 (Wen-Kai Tai)
Committee Members: 張國清, 陳怡伶
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Publication Year: 2019
Graduation Academic Year: 107
Language: English
Pages: 65
Chinese Keywords: 撞球、8號球AI、人工智慧、倒推法、機器學習、模仿學習
English Keywords: Billiards, 8-ball AI, Artificial Intelligence, Backward Induction, Machine Learning, Imitation Learning
Chinese Abstract:

    In recent years, most AI research in the game domain has focused on increasing AI strength so that the AI can compete with human players and even surpass professional-level play; the well-known Go AI AlphaGo is one example. When it comes to the player's overall game experience, however, AI design must also account for another important metric: the human likeness of the AI's behavior, so that players can accept their opponent's reasoning as plausible during a match instead of having their impression of the game spoiled by an ability level no human could exhibit.

    This study takes billiard AI as its subject and describes how ordinary AI decisions differ from human ones. We analyze records of real players' games, extract feature vectors describing their shots, and use the Backward Induction algorithm together with machine learning to imitate and approximate the decision process of human billiard players, giving the AI suggestions for strategy and shot-making while avoiding over-reliance on powerful, precise physics simulation.

    This study applies humanized strategies to the AI so that it mimics human thinking; compared with the original version, it comes closer to the ball paths and cue techniques that real players chose. We also define a suitable measurement of human likeness to evaluate the method's effectiveness, striking a balance between AI strength and human likeness and thereby providing a better game experience. In the experiments, our proposed method is overall closer to human players' style of play than the original AI, demonstrating that it effectively improves the AI's human likeness.


English Abstract:

    In recent years, most research in the field of game Artificial Intelligence (AI) has focused on improving the strength of AI, producing agents that can compete with human players and even surpass professional players; the well-known Go AI AlphaGo is one example. However, when it comes to the gameplay experience, another important factor must be considered: the human likeness of the AI's behavior. Human-like behavior lets players accept their opponent's reasoning as plausible, instead of leaving a poor impression of the game because of the AI's overpowered abilities.

    This paper studies AI in billiard games, describing the difference between the decisions of typical AIs and those of human players. We analyzed actual game records of human players and extracted feature vectors from the data, then leveraged the Backward Induction algorithm and machine learning to imitate the decision-making process of human players. By providing the AI with strategy and shot suggestions, our approach avoids over-dependence on robust, precise physics simulation.
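    As a rough illustration of the backward-induction idea described above, the following is a minimal Python sketch, not the thesis's implementation: it plans a ball-clearing order back to front, fixing the final shot first and then choosing the ball that best feeds position into each already-planned shot. The single-pocket simplification, the greedy distance criterion, and all names here are assumptions made for illustration.

    import math

    def plan_clearing_order(balls, pocket):
        """Plan a pocketing order back to front: fix the final shot first,
        then repeatedly pick the ball that best feeds position into the
        shot already planned after it (a greedy stand-in for a full search)."""
        order = []
        remaining = list(balls)
        next_target = pocket  # the last shot plays toward the pocket
        while remaining:
            ball = min(remaining, key=lambda b: math.dist(b, next_target))
            order.append(ball)
            remaining.remove(ball)
            next_target = ball  # earlier shots should leave the cue ball here
        order.reverse()  # we planned the final shot first
        return order

    if __name__ == "__main__":
        balls = [(1.0, 0.2), (0.2, 0.9), (1.3, 0.7)]
        print(plan_clearing_order(balls, pocket=(1.4, 1.0)))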

    This study applies human-like strategies to the AI so that it mimics human thinking; compared with the original AI, its play is more similar to that of actual human players. We also define an appropriate approach to measuring the human likeness of an AI and use it to evaluate our proposed methods, which balance the strength and the human likeness of the AI to provide a better gameplay experience. The experimental results show that our method is overall more similar to the way human players play than the original AI, demonstrating that it effectively improves the AI's human likeness.
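    A human-likeness measurement of the kind described could, for instance, compare the AI's choices against recorded human choices in the same table states. The sketch below is a hypothetical version of such a metric; the record layout and function names are assumptions, not taken from the thesis.

    import math

    def human_likeness(records, ai_policy):
        """records: iterable of (state, human_ball, human_params) tuples
        taken from recorded human games; ai_policy(state) returns the AI's
        (chosen_ball, shot_params) for the same state."""
        agree, param_gap, n = 0, 0.0, 0
        for state, human_ball, human_params in records:
            ai_ball, ai_params = ai_policy(state)
            agree += int(ai_ball == human_ball)              # same target ball chosen?
            param_gap += math.dist(ai_params, human_params)  # e.g. (angle, power)
            n += 1
        return agree / n, param_gap / n  # agreement rate, mean parameter gap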

Table of Contents:
論文摘要 (Chinese Abstract)
Abstract
Acknowledgement
Table of Contents
Table of Figures
List of Tables
Table of Algorithms
1 Introduction
2 Related Work
  2.1 Billiards AI
  2.2 Imitation Learning
  2.3 Other Game AIs
  2.4 Summary
3 Proposed Method
  3.1 Settings and Framework
  3.2 Rule-based Backward Induction
    3.2.1 The Pipeline
    3.2.2 Track Evaluation
  3.3 Hybrid Method
    3.3.1 Neural Network Model Construction
    3.3.2 Feature Selection
    3.3.3 Data Filtering
    3.3.4 The Pipeline
  3.4 Summary
4 Experimental Results
  4.1 Experimental Environment
  4.2 Training Results of Neural Network Models
  4.3 Comparison with the Data
  4.4 User Study
    4.4.1 The Setup
    4.4.2 Overall Result
    4.4.3 Particular Aspects
    4.4.4 Complexity of Tables
  4.5 Computation Time
  4.6 Pros and Cons
  4.7 Summary
5 Conclusions and Future Work
  5.1 Future Work
References


Full text available from 2024/08/20 (campus network)
Full text available from 2024/08/20 (off-campus network)
Full text available from 2024/08/20 (National Central Library: Taiwan NDLTD system)