
Graduate Student: Cecilia
Thesis Title: Active Structural Control using Reinforcement Learning with Deep Deterministic Policy Gradient Algorithm
Advisor: Pei-Ching Chen
Committee Members: Tzu-Kang Lin, Shieh-Kung Huang, Yong-An Lai
Degree: Master
Department: Department of Civil and Construction Engineering, College of Engineering
Publication Year: 2023
Graduation Academic Year: 111
Language: English
Number of Pages: 129
Keywords: deep deterministic policy gradient
Access Statistics: 203 views, 0 downloads


This research focuses on the application of reinforcement learning (RL) algorithms to active structural control for mitigating structural responses induced by seismic excitation. Specifically, the study explores the use of RL as either a secondary or a primary controller for structural vibration suppression using an active mass damper (AMD). The secondary controller, developed with the deep deterministic policy gradient (DDPG) algorithm, acts as a supplementary controller that enhances the performance of a primary controller designed with classical or modern control theory. The DDPG agent uses four observation inputs from the previous time step: the force generated by the primary controller, the acceleration at the top floor, the action taken by the agent, and the combined force of the primary and secondary controllers.
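To make the observation design concrete, the following Python sketch assembles the four quantities listed above into a single observation vector for the secondary DDPG agent at one control step. It is only an illustrative sketch: the function name, variable names, and normalization constants are hypothetical and not taken from the thesis.

```python
import numpy as np

def build_observation(primary_force, top_floor_accel, last_action, combined_force,
                      force_scale=1.0, accel_scale=1.0):
    """Assemble the 4-element observation for the secondary DDPG agent.

    All four quantities come from the previous control time step: the
    primary-controller force, the top-floor acceleration, the agent's
    previous action, and the combined (primary + secondary) force.
    The scale factors are hypothetical normalization constants.
    """
    return np.array([
        primary_force / force_scale,
        top_floor_accel / accel_scale,
        last_action,
        combined_force / force_scale,
    ], dtype=np.float32)

# Example: observation at one time step (all numbers hypothetical)
obs = build_observation(primary_force=120.0, top_floor_accel=0.35,
                        last_action=0.1, combined_force=135.0,
                        force_scale=500.0, accel_scale=9.81)
print(obs.shape)  # (4,)
```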
When RL serves as the primary controller, the algorithm is trained as an acceleration feedback controller that takes the top-floor acceleration as its input. The time interval (dt) of the observation input to the agent plays an important role and should be selected according to the natural frequencies of the structure to be controlled: smaller dt values are recommended for stiffer structures with higher natural frequencies, while larger dt values are suggested for more flexible structures with lower natural frequencies. The combined numerical and experimental studies demonstrate the implementation of the DDPG agent for suppressing earthquake-induced structural vibration responses. This research contributes to the advancement of active structural control using RL algorithms, demonstrating the potential of DDPG as both a secondary and a primary controller. The findings offer insights into optimizing control strategies for enhancing structural resilience and mitigating seismic-induced vibrations.
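The abstract gives only a qualitative guideline for choosing dt, so the sketch below encodes one plausible interpretation of it: tying dt to the fundamental natural period of the structure by fixing the number of observation samples per period. The function name and the samples-per-period value are assumptions for illustration, not the thesis's actual rule.

```python
def observation_interval(natural_freq_hz, samples_per_period=20):
    """Choose the observation time interval dt (in seconds) from the
    structure's fundamental natural frequency (in Hz).

    A stiffer structure (higher natural frequency, shorter period) gets a
    smaller dt; a more flexible structure (lower natural frequency) gets a
    larger dt. The samples_per_period value is a hypothetical choice.
    """
    natural_period = 1.0 / natural_freq_hz
    return natural_period / samples_per_period

# Stiff structure at 5 Hz vs. flexible structure at 0.5 Hz
print(observation_interval(5.0))   # ~0.01 s
print(observation_interval(0.5))   # ~0.1 s
```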

Table of Contents:
Abstract
Acknowledgment
Contents
List of Tables
List of Figures
List of Symbols
CHAPTER 1: INTRODUCTION
  1.1. Research background
  1.2. Scope and objective
  1.3. Outline of the thesis
CHAPTER 2: LITERATURE STUDY
  2.1. Active Structural Control Algorithm
    2.1.1. Optimized Linear Quadratic Regulator Controller
    2.1.2. Acceleration Feedback Controller
  2.2. Application of Machine Learning
    2.2.1. Application of Supervised Learning
    2.2.2. Application of Reinforcement Learning
CHAPTER 3: METHODOLOGY
  3.1. Neural Network
  3.2. Autoregressive Exogenous (ARX) Neural Network
  3.3. Reinforcement Learning
    3.3.1. Value Function
    3.3.2. Actor-Critic
  3.4. Deep Deterministic Policy Gradient
  3.5. Bellman Equation
  3.6. Numerical Model of Structure
  3.7. Performance Indices
  3.8. Software
CHAPTER 4: REINFORCEMENT LEARNING AS SECONDARY CONTROLLER
  4.1. Methodology
    4.1.1. Observation Input
    4.1.2. Reward Function
  4.2. Numerical Studies
    4.2.1. Controller Design
      4.2.1.1. Primary Controller Design
      4.2.1.2. RL Secondary Controller Design
    4.2.2. Numerical Result and Analysis
  4.3. Experimental Studies
    4.3.1. Experimental Setup
    4.3.2. System Identification
    4.3.3. Controller Design
      4.3.3.1. Main Controller Design
      4.3.3.2. RL Secondary Controller Design
      4.3.3.3. AMD Force Tracking Controller
    4.3.4. Numerical Result and Analysis
    4.3.5. Experiment Result and Analysis
CHAPTER 5: REINFORCEMENT LEARNING AS PRIMARY CONTROLLER
  5.1. Methodology
    5.1.1. Observation Input and Ground Acceleration
    5.1.2. Reward Function
      5.1.2.1. Reward Weighting Factors Tuning
    5.1.3. Normalization
    5.1.4. Overtraining
    5.1.5. Zero Mean Correction
    5.1.6. Target Update Frequency
  5.2. Numerical Studies
    5.2.1. Single Degree of Freedom (SDOF)
      5.2.1.1. Observation Tuning Based on SDOF Natural Frequency
      5.2.1.2. Reward Weighting Factors Effect
    5.2.2. Multiple Degree of Freedom (MDOF)
      5.2.2.1. Controller Design
      5.2.2.2. Numerical Analysis and Result
  5.3. Experimental Studies
    5.3.1. Experimental Setup
    5.3.2. System Identification
    5.3.3. Controller Design
      5.3.3.1. Reinforcement Learning
      5.3.3.2. AMD Force Tracking Controller
    5.3.4. Experiment Result and Analysis
CHAPTER 6: SUMMARY AND CONCLUSIONS
  6.1. Summary
  6.2. Conclusions
REFERENCES


Full Text Release Date: 2025/08/07 (campus network)
Full Text Release Date: 2025/08/07 (off-campus network)
Full Text Release Date: 2025/08/07 (National Central Library: Taiwan thesis system)