
Author: Ting-Fu Huang (黃鼎富)
Thesis Title: Deep Reinforcement Learning based Autonomous Flight Control System for Drones under Turbulence Environment
Advisor: Min-Fan Ricky Lee
Committee: Min-Fan Ricky Lee, 柯正浩, 許聿靈
Degree: Master
Department: Graduate Institute of Automation and Control, College of Engineering
Year of Publication: 2023
Academic Year of Graduation: 112
Language: English
Pages: 76
Keywords: Autonomous aerial vehicles, intelligent robots, object detection, reinforcement learning, trajectory optimization



The flight control of drones in turbulent environments involves trajectory tracking and obstacle avoidance. These control behaviors adjust their models and actions based on new information and the constantly changing environment to achieve better performance and goal attainment. However, traditional mathematical methods for modeling drone behavior control face challenges such as modeling difficulty, high nonlinearity, and complex action spaces. Moreover, in many AI models, the internal operations and decision-making processes are often viewed as black boxes. This paper presents a drone control system based on reinforcement learning for turbulent conditions. Decisions, such as crossing or evading, are made based on the detected turbulence level, and are adjusted automatically as the models and algorithms adapt to changes in the environment and data. Furthermore, a fuzzy inference system is employed to make the black-box model transparent, enabling the relationship between input features and output predictions to be understood. Finally, the indicators used for model evaluation include the long-term risk span and the dispersion across time, which assess the degree of convergence. All of these indicators closely approach zero, suggesting that the model training results align with expectations. When trajectory errors are compared against the results of other similar studies using RMSE, the differences are minimal, indicating the effectiveness of the model trained using reinforcement learning. In addition, RMSE is used to quantify the error between the model's output and the fuzzy inference system's output, and the observations reveal a similar pattern in the two outputs.
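The crossing-or-evading decision described in the abstract can be illustrated with a minimal fuzzy-style rule. This is only a sketch: the triangular membership functions, their breakpoints, and the two-action mapping are hypothetical, since the record does not give the controller's actual rule base.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c, peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def decide(turbulence: float) -> str:
    """Map a detected turbulence level in [0, 1] to an action.

    Membership breakpoints are illustrative placeholders, not values
    from the thesis."""
    mu_low = tri(turbulence, -0.5, 0.0, 0.6)   # "low turbulence" degree
    mu_high = tri(turbulence, 0.4, 1.0, 1.5)   # "high turbulence" degree
    # Cross when low-turbulence membership dominates, otherwise evade.
    return "cross" if mu_low >= mu_high else "evade"
```

A transparent rule of this form makes the relationship between the input feature (turbulence level) and the output decision directly inspectable, which is the role the fuzzy inference system plays for the black-box model.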

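The RMSE comparison mentioned in the abstract, between the model's output and the fuzzy inference system's output, can be sketched as follows. The two sequences here are invented placeholders, not data from the thesis.

```python
import math

def rmse(a, b):
    """Root-mean-square error between two equal-length sequences."""
    assert len(a) == len(b) and len(a) > 0
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Hypothetical output sequences: RL model vs. fuzzy inference system.
model_out = [0.10, 0.25, 0.40, 0.55]
fuzzy_out = [0.12, 0.24, 0.43, 0.52]
print(rmse(model_out, fuzzy_out))  # ≈ 0.024
```

A small RMSE between the two sequences, as reported in the abstract, indicates that the transparent fuzzy system tracks the black-box model's behavior closely.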
Acknowledgements I
Abstract (Chinese) II
Abstract (English) III
Table of Contents IV
List of Figures VI
List of Tables IX
Chapter 1 Introduction 1
Chapter 2 Method 6
2.1 Related Work 6
2.1.1 Reinforcement Learning 6
2.1.2 Obstacle Avoidance 7
2.1.3 Trajectory Tracking 8
2.2 Problem Statement 10
2.3 Workflow 13
2.4 Environment 15
2.5 Reward Shaping 22
2.6 Policy 26
2.7 Model Training 28
2.8 Explainable AI 31
2.9 Deploy 37
Chapter 3 Result 40
3.1 UAV Hardware 41
3.2 Simulation Result and Analysis 42
3.3 Test Result of Real Flight 56
3.4 Explainable AI 62
Chapter 4 Discussion 66
4.1 Interpretation of the Research Findings 66
4.2 Relevance to the Literature 67
4.3 Hypothesis Validation 68
4.4 Significance and Impact 69
4.5 Limitation and Future Work 70
References 72


Full text release date: 2025/12/20 (campus network)
Full text release date: 2025/12/20 (off-campus network)
Full text release date: 2025/12/20 (National Central Library: Taiwan NDLTD system)