Development of Reinforcement Learning Based Dynamic Window Approach for Autonomous Navigation Systems of Mobile Robots

Researcher:	嚴少威 Shao-Wei Yan
Thesis title:	Development of Reinforcement Learning Based Dynamic Window Approach for Autonomous Navigation Systems of Mobile Robots (original Chinese title: 基於強化式學習動態視窗法之移動機器人自主導航系統開發)
Advisor:	郭重顯 Chung-Hsien Kuo
Oral defense committee:	黃漢邦 Han-Pang Huang, 林其禹, 陸敬互, 郭重顯 Chung-Hsien Kuo
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication:	2021
Academic year of graduation:	109 (Republic of China calendar)
Language:	English
Number of pages:	89
Keywords:	Autonomous Navigation, Path Planning, Dynamic Window Approach, Reinforcement Learning, Deep Learning

In Industry 4.0 factories, production-line layouts change quickly and often, so mobile robots must be able to navigate and make autonomous decisions, such as avoiding obstacles, in complex areas containing ‘U’-, ‘V’-, or ‘G’-shaped obstacles, keeping the automated guided vehicle, its cargo, and nearby personnel safe. The Dynamic Window Approach (DWA) is a common obstacle avoidance method. However, the evaluation function of the original algorithm is too limited for the robot to escape from traps in complex environments, so this study modifies the evaluation function to strengthen its navigation ability. Because DWA has no mechanism for choosing the weights of its evaluation function, this study further applies Reinforcement Learning (RL), defining the state, action, and reward so that the DWA weights are learned adaptively and the navigation ability of DWA improves.
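To make the idea of learned evaluation weights concrete, the following is a minimal sketch rather than the implementation used in the thesis. It assumes the three evaluation terms of the classic DWA (heading, clearance, velocity), a simplified admissibility check that only discards colliding trajectories, and a tabular Q-learning agent that picks one of a few fixed weight sets. The robot limits, the `WEIGHT_SETS` values, the state encoding in `encode_state`, and all function names are hypothetical; the abstract does not specify the thesis's modified evaluation functions, state, action, or reward.

```python
import math
import random

# Hypothetical robot limits and sampling settings (not taken from the thesis).
V_MAX, W_MAX = 0.5, 1.0      # max linear (m/s) and angular (rad/s) speed
ACC_V, ACC_W = 0.5, 2.0      # max linear and angular acceleration
DT, HORIZON = 0.1, 2.0       # control period and prediction horizon (s)
ROBOT_RADIUS = 0.2           # robot footprint radius (m)

def simulate(pose, v, w):
    """Roll a unicycle model forward and return the predicted poses."""
    x, y, th = pose
    traj = []
    for _ in range(int(HORIZON / DT)):
        x += v * math.cos(th) * DT
        y += v * math.sin(th) * DT
        th += w * DT
        traj.append((x, y, th))
    return traj

def dwa_step(pose, vel, goal, obstacles, weights, n=11):
    """Return the (v, w) in the dynamic window with the best weighted score."""
    v0, w0 = vel
    v_lo, v_hi = max(0.0, v0 - ACC_V * DT), min(V_MAX, v0 + ACC_V * DT)
    w_lo, w_hi = max(-W_MAX, w0 - ACC_W * DT), min(W_MAX, w0 + ACC_W * DT)
    best, best_score = (0.0, 0.0), -math.inf   # stop if nothing is admissible
    for i in range(n):
        for j in range(n):
            v = v_lo + (v_hi - v_lo) * i / (n - 1)
            w = w_lo + (w_hi - w_lo) * j / (n - 1)
            traj = simulate(pose, v, w)
            clearance = min(
                (math.hypot(px - ox, py - oy)
                 for px, py, _ in traj for ox, oy in obstacles),
                default=math.inf) - ROBOT_RADIUS
            if clearance <= 0.0:
                continue  # simplified check: drop trajectories that collide
            x, y, th = traj[-1]
            err = math.atan2(goal[1] - y, goal[0] - x) - th
            err = math.atan2(math.sin(err), math.cos(err))  # wrap to [-pi, pi]
            terms = (1.0 - abs(err) / math.pi, min(clearance, 1.0), v / V_MAX)
            score = sum(a * t for a, t in zip(weights, terms))
            if score > best_score:
                best, best_score = (v, w), score
    return best

# Hypothetical discrete action set: (heading, clearance, velocity) weights.
WEIGHT_SETS = [(0.8, 0.1, 0.1), (0.2, 0.6, 0.2), (0.3, 0.2, 0.5)]

def encode_state(pose, goal, obstacles):
    """Assumed state: coarse bins of goal distance and nearest obstacle."""
    d_goal = math.hypot(goal[0] - pose[0], goal[1] - pose[1])
    d_obs = min((math.hypot(pose[0] - ox, pose[1] - oy)
                 for ox, oy in obstacles), default=10.0)
    return (min(int(d_goal), 5), min(int(min(d_obs, 10.0) / 0.5), 4))

class WeightAgent:
    """Tabular, epsilon-greedy Q-learning over the indices of WEIGHT_SETS."""
    def __init__(self, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
        self.q = {}
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.rng = random.Random(seed)

    def values(self, state):
        return self.q.setdefault(state, [0.0] * len(WEIGHT_SETS))

    def act(self, state):
        if self.rng.random() < self.eps:
            return self.rng.randrange(len(WEIGHT_SETS))
        vals = self.values(state)
        return vals.index(max(vals))

    def learn(self, s, a, r, s_next):
        target = r + self.gamma * max(self.values(s_next))
        self.values(s)[a] += self.alpha * (target - self.values(s)[a])
```

In a control loop under these assumptions, each step would call `encode_state`, let the agent choose an index with `act`, pass `WEIGHT_SETS[index]` to `dwa_step`, and then call `learn` with whatever reward the task defines, for example a positive reward for progress toward the goal and a penalty for collisions or for staying trapped.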
In addition, when a mobile robot navigates along a corridor, the two-dimensional LiDAR point cloud features are too similar from one position to the next for a Simultaneous Localization and Mapping (SLAM) algorithm to localize the robot on the map from them; this is the corridor effect. This study therefore proposes using a deep learning network to recognize LiDAR point cloud features, identify areas where the corridor effect may occur, and switch the localization system autonomously to avoid it.
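The switching logic can be sketched as follows. This is an illustration under stated assumptions, not the thesis's method: a hand-written geometric heuristic (`anisotropy`, the eigenvalue ratio of the scan's covariance) stands in for the deep learning network, and the source names `"slam"` and `"alternate"`, the class `LocalizationSwitch`, and both thresholds are hypothetical. The abstract does not say which localization system the robot switches to.

```python
import math

def anisotropy(points):
    """Score in [0, 1]: near 1 when the 2D scan is stretched along one axis
    (as between two long parallel walls), near 0 when it is balanced.
    Stand-in for a learned corridor classifier."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    # Eigenvalues of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    half_tr = (sxx + syy) / 2.0
    disc = math.sqrt(max(half_tr ** 2 - (sxx * syy - sxy ** 2), 0.0))
    lam_max, lam_min = half_tr + disc, half_tr - disc
    return 0.0 if lam_max == 0.0 else 1.0 - lam_min / lam_max

class LocalizationSwitch:
    """Choose a localization source per scan, with hysteresis so the
    source does not flip back and forth near the threshold."""
    def __init__(self, classifier, enter=0.8, leave=0.6):
        self.classifier = classifier
        self.enter, self.leave = enter, leave   # assumed thresholds
        self.source = "slam"

    def update(self, points):
        score = self.classifier(points)
        if self.source == "slam" and score >= self.enter:
            self.source = "alternate"   # corridor-like scan: leave SLAM
        elif self.source == "alternate" and score <= self.leave:
            self.source = "slam"        # distinctive scan: return to SLAM
        return self.source

# Synthetic scans: two parallel walls 2 m apart, and a 4 m square room.
corridor = [(t * 0.1, s) for t in range(-50, 51) for s in (-1.0, 1.0)]
room = ([(t * 0.1, s) for t in range(-20, 21) for s in (-2.0, 2.0)]
        + [(s, t * 0.1) for t in range(-20, 21) for s in (-2.0, 2.0)])
```

Replacing `anisotropy` with a trained network's corridor probability would leave `LocalizationSwitch` unchanged, which is why the classifier is passed in as a callable.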
To demonstrate the effectiveness of the proposed methods, the experiments are divided into four parts. The first part uses a simulated mobile robot to verify the effect of the modified and newly added DWA evaluation functions. The second part compares the RL-modified DWA with results from the literature. The third part verifies the feasibility of the RL-modified DWA in real complex environments. The fourth part verifies the feasibility, in corridor terrain, of the localization switching system based on deep learning recognition of LiDAR point cloud features. The experiments show that the navigation success rate of the RL-modified DWA in complex terrain is 2.42% higher than that reported in the literature, and that the completion rate of the localization switching system in corridor terrain is 36.25% higher than that of pure SLAM localization.

Advisor's Recommendation Letter    i
Oral Defense Committee Approval Certificate    ii
Acknowledgements    iii
Abstract (in Chinese)    iv
Abstract    v
List of Tables    ix
List of Figures    x
Nomenclature    xiii
Chapter 1    Introduction    1
1    Motivation and Purpose    1
2    Literature Review    3
2.1    Related Research of Path Planning    3
2.2    Related Research of Dynamic Window Approach    4
2.3    Related Research of Reinforcement Learning    5
2.4    Related Research of Deep Learning Graph Recognition and Kidnapped Robot Problem    6
3    Organization of the Thesis    7
Chapter 2    System Architecture and Operation    8
1    System Organization    8
2    Robot Operating System (ROS)    13
3    System Operation Process and Design    14
4    Simultaneous Localization and Mapping (Cartographer SLAM)    16
5    Deep Learning to Recognize Point Cloud Feature    18
Chapter 3    Reinforcement Learning Combined with Modified Dynamic Window Approach System Design    25
1    Introduction to DWA    27
2    Modified DWA for Trapper Environment    30
2.1    Introducing Improve DWA Literature    30
2.2    Modified DWA    31
3    Applying Reinforcement Learning for Modified DWA    39
3.1    Reinforcement Learning    39
3.2    Define the State    40
3.3    Define the Action    41
3.4    Define the Reward Functions    41
3.5    The Operation of the Proposed RL    42
Chapter 4    Experiments and Results    44
1    Part 1: Performance Improved on the Modified DWA Evaluation Functions    45
1.1    Experiment 1.1    46
1.2    Experiment 1.2    47
2    Part 2: Simulations on Maze Maps Using RL Methods    50
2.1    Experiment 2.1: RL Agent Training    50
2.2    Experiment 2.2: Comparisons of Successful Rates Based on Random Initial Start Positions    52
2.3    Experiment 2.3: A Varied Maze Map for Model Adoption to Different Environment    53
3    Part 3: Comparative evaluations in terms of ‘U’, ‘G’ and ‘U-V-Compound’ Maps for Simulations and Real Robot Navigation    54
3.1    Experiment 3.1: ‘U’ Shaped Obstacle    57
3.2    Experiment 3.2: ‘G’ Shaped Obstacle    59
3.3    Experiment 3.3: ‘U-V-Compound’ Shaped Obstacle    61
3.4    Summary of Experiments    63
4    Part 4: Verify the Method of Deep Learning to Recognize Lidar Point Cloud Feature Switch Localization System Solves the Corridor Effect    66
4.1    Experiment 4.1    66
4.2    Experiment 4.2    69
Chapter 5    Conclusions and Future Works    71
References    72


                                

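Full-text release date: 2024/07/10 (campus network)
Full-text release date: full text not authorized for public release (off-campus network)
Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)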
