
Student: Che-Wei Lee (李哲維)
Thesis Title: Reinforcement Learning Motion Control for Snake Robot (基於強化學習之蛇形機器人運動控制)
Advisor: Min-Fan Ricky Lee (李敏凡)
Committee Members: Joni-Tzuchen Tang (湯梓辰), Kevin Cheng-Hao Ko (柯正浩)
Degree: Master
Department: College of Engineering - Graduate Institute of Automation and Control (工程學院 - 自動化及控制研究所)
Year of Publication: 2022
Academic Year of Graduation: 110 (2021-2022)
Language: Chinese
Number of Pages: 96
Keywords (Chinese): 運動控制 (motion control), 多智能體系統 (multi-agent systems), 強化學習 (reinforcement learning), 蛇形機器人 (snake robots)
Keywords (English): Motion control, Multi-agent systems, Reinforcement learning, Snake robots
Access Statistics: 336 views, 0 downloads
  • For the past few years, motion control of snake robots has been a very challenging problem. It involves technical issues such as non-independent joints and the linkage mechanisms connecting them, so solutions cannot be applied straightforwardly to many fields. In the challenge papers, the modeling approach mainly analyzes kinematics and dynamics to compute transfer functions; the main uncertainty factors are the environment, the target object, and the hardware, which makes this difficult. This thesis proposes a modeling approach based on a reinforcement learning algorithm, carried out sequentially in four key steps: environment, reward, policy, and training. In the results, the parameters are compared against those of the challenge papers, and the observed phenomena are explained. In the future, the optimal control module obtained by the experimental method could be deployed on a real snake robot, and the training method could be applied to other fields, for example underwater gliding, target tracking, and obstacle avoidance for snake robots.


    For the past few years, the motion control of snake robots has been a challenging issue: the joints are not independent and are coupled through the connection mechanism in the body, so existing controllers do not transfer easily to different fields. In the challenge papers, the modeling approach is mainly to analyze the kinematics and dynamics to compute the transfer functions; the main uncertainty factors are the environment, the target object, and the hardware, which makes this difficult. This thesis proposes a modeling approach that uses a reinforcement learning algorithm, carried out sequentially in four key steps: environment, reward, policy, and training. In the results, the parameters are compared against those of the challenge papers; the differences are analyzed and the phenomena explained.
    In the future, the proposed controller, which is experimentally validated in simulation, could be used on a real snake robot, and the training method could be applied to different fields. For example, snake robots could glide underwater, track targets, avoid obstacles, and so on.
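    For illustration only, the four-step workflow named in the abstract (environment, reward, policy, training) could be sketched as tabular Q-learning on a toy one-dimensional task. The `LineEnv` class, the reward values, and all hyperparameters below are assumptions made for this sketch, not the thesis's actual snake-robot model or controller.

    ```python
    import random

    class LineEnv:
        """Step 1, environment: a toy 1-D stand-in for the robot.
        The agent starts at position 0 and must reach position `goal`."""
        def __init__(self, goal=3):
            self.goal = goal
            self.pos = 0

        def reset(self):
            self.pos = 0
            return self.pos

        def step(self, action):
            # Action 0 moves left, 1 moves right; positions are clamped.
            self.pos += 1 if action == 1 else -1
            self.pos = max(-self.goal, min(self.goal, self.pos))
            done = self.pos == self.goal
            # Step 2, reward: small per-step cost plus a bonus at the goal.
            reward = 1.0 if done else -0.01
            return self.pos, reward, done

    def greedy(q, s):
        # Step 3, policy: act greedily with respect to the learned Q-table.
        return max((0, 1), key=lambda a: q.get((s, a), 0.0))

    def train(env, episodes=1000, alpha=0.5, gamma=0.9, eps=0.1, seed=1):
        # Step 4, training: epsilon-greedy tabular Q-learning.
        rng = random.Random(seed)
        q = {}
        for _ in range(episodes):
            s = env.reset()
            for _ in range(50):
                a = rng.randrange(2) if rng.random() < eps else greedy(q, s)
                s2, r, done = env.step(a)
                # Bootstrap from the best next action unless the episode ended.
                target = r if done else r + gamma * max(q.get((s2, b), 0.0) for b in (0, 1))
                q[(s, a)] = q.get((s, a), 0.0) + alpha * (target - q.get((s, a), 0.0))
                s = s2
                if done:
                    break
        return q

    env = LineEnv(goal=3)          # environment
    q = train(env)                 # reward, policy, and training inside
    s, done = env.reset(), False
    for _ in range(10):            # greedy rollout with the trained policy
        s, _, done = env.step(greedy(q, s))
        if done:
            break
    print("reached goal:", done)
    ```

    The same skeleton (environment defining states and transitions, reward shaping progress, a policy derived from learned values, and an iterative training loop) carries over to the deep RL setting the thesis uses, where the Q-table is replaced by a function approximator.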

    Table of Contents

    Acknowledgements
    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    Chapter 2 Methods
        2.1 Environment
        2.2 Reward
        2.3 Policy
        2.4 Train
    Chapter 3 Results
    Chapter 4 Discussions
    Chapter 5 Conclusions
    References


    Full Text Release Date: 2024/09/07 (campus network)
    Full Text Release Date: 2024/09/07 (off-campus network)
    Full Text Release Date: 2024/09/07 (National Central Library: Taiwan NDLTD system)