
Graduate Student: Cecilia
Thesis Title: Active Structural Control using Reinforcement Learning with Deep Deterministic Policy Gradient Algorithm
Advisor: Pei-Ching Chen
Committee Members: Tzu-Kang Lin, Shieh-Kung Huang, Yong-An Lai
Degree: Master
Department: Department of Civil and Construction Engineering, College of Engineering
Publication Year: 2023
Graduation Academic Year: 111
Language: English
Number of Pages: 129
Keywords: deep deterministic policy gradient
Access Statistics: 203 views, 0 downloads


This research focuses on the application of reinforcement learning (RL) algorithms to active structural control for mitigating structural responses induced by seismic excitation. Specifically, the study explores the use of RL as either a secondary or a primary controller for structural vibration suppression using an active mass damper (AMD). The secondary controller, developed with the deep deterministic policy gradient (DDPG) algorithm, acts as a supplementary controller that enhances the performance of a primary controller designed with classical or modern control theory. The DDPG agent uses four observation inputs from the previous time step: the force generated by the primary controller, the acceleration at the top floor, the action taken by the agent, and the combined force of the primary and secondary controllers.
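To make the observation design concrete, the following Python sketch assembles the four quantities listed above into a single observation vector for the secondary DDPG agent at one control step. It is only an illustrative sketch: the function name, variable names, and normalization constants are hypothetical and not taken from the thesis.

```python
import numpy as np

def build_observation(primary_force, top_floor_accel, last_action, combined_force,
                      force_scale=1.0, accel_scale=1.0):
    """Assemble the 4-element observation for the secondary DDPG agent.

    All four quantities come from the previous control time step: the
    primary-controller force, the top-floor acceleration, the agent's
    previous action, and the combined (primary + secondary) force.
    The scale factors are hypothetical normalization constants.
    """
    return np.array([
        primary_force / force_scale,
        top_floor_accel / accel_scale,
        last_action,
        combined_force / force_scale,
    ], dtype=np.float32)

# Example: observation at one time step (all numbers hypothetical)
obs = build_observation(primary_force=120.0, top_floor_accel=0.35,
                        last_action=0.1, combined_force=135.0,
                        force_scale=500.0, accel_scale=9.81)
print(obs.shape)  # (4,)
```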
When RL serves as the primary controller, the algorithm is trained as an acceleration feedback controller that takes the top-floor acceleration as its input. The time interval (dt) of the observation input to the agent plays an important role and should be selected according to the natural frequencies of the structure to be controlled: smaller dt values are recommended for stiffer structures with higher natural frequencies, while larger dt values are suggested for more flexible structures with lower natural frequencies. The combined numerical and experimental studies demonstrate the implementation of the DDPG agent for suppressing earthquake-induced structural vibration responses. This research contributes to the advancement of active structural control using RL algorithms, demonstrating the potential of DDPG as both a secondary and a primary controller. The findings offer insights into optimizing control strategies for enhancing structural resilience and mitigating seismic-induced vibrations.
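The abstract gives only a qualitative guideline for choosing dt, so the sketch below encodes one plausible interpretation of it: tying dt to the fundamental natural period of the structure by fixing the number of observation samples per period. The function name and the samples-per-period value are assumptions for illustration, not the thesis's actual rule.

```python
def observation_interval(natural_freq_hz, samples_per_period=20):
    """Choose the observation time interval dt (in seconds) from the
    structure's fundamental natural frequency (in Hz).

    A stiffer structure (higher natural frequency, shorter period) gets a
    smaller dt; a more flexible structure (lower natural frequency) gets a
    larger dt. The samples_per_period value is a hypothetical choice.
    """
    natural_period = 1.0 / natural_freq_hz
    return natural_period / samples_per_period

# Stiff structure at 5 Hz vs. flexible structure at 0.5 Hz
print(observation_interval(5.0))   # ~0.01 s
print(observation_interval(0.5))   # ~0.1 s
```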

Table of Contents:
Abstract
Acknowledgment
Contents
List of Tables
List of Figures
List of Symbols
CHAPTER 1: INTRODUCTION
  1.1. Research background
  1.2. Scope and objective
  1.3. Outline of the thesis
CHAPTER 2: LITERATURE STUDY
  2.1. Active Structural Control Algorithm
    2.1.1. Optimized Linear Quadratic Regulator Controller
    2.1.2. Acceleration Feedback Controller
  2.2. Application of Machine Learning
    2.2.1. Application of Supervised Learning
    2.2.2. Application of Reinforcement Learning
CHAPTER 3: METHODOLOGY
  3.1. Neural Network
  3.2. Autoregressive Exogenous (ARX) Neural Network
  3.3. Reinforcement Learning
    3.3.1. Value Function
    3.3.2. Actor-Critic
  3.4. Deep Deterministic Policy Gradient
  3.5. Bellman Equation
  3.6. Numerical Model of Structure
  3.7. Performance Indices
  3.8. Software
CHAPTER 4: REINFORCEMENT LEARNING AS SECONDARY CONTROLLER
  4.1. Methodology
    4.1.1. Observation Input
    4.1.2. Reward Function
  4.2. Numerical Studies
    4.2.1. Controller Design
      4.2.1.1. Primary Controller Design
      4.2.1.2. RL Secondary Controller Design
    4.2.2. Numerical Result and Analysis
  4.3. Experimental Studies
    4.3.1. Experimental Setup
    4.3.2. System Identification
    4.3.3. Controller Design
      4.3.3.1. Main Controller Design
      4.3.3.2. RL Secondary Controller Design
      4.3.3.3. AMD Force Tracking Controller
    4.3.4. Numerical Result and Analysis
    4.3.5. Experiment Result and Analysis
CHAPTER 5: REINFORCEMENT LEARNING AS PRIMARY CONTROLLER
  5.1. Methodology
    5.1.1. Observation Input and Ground Acceleration
    5.1.2. Reward Function
      5.1.2.1. Reward Weighting Factors Tuning
    5.1.3. Normalization
    5.1.4. Overtraining
    5.1.5. Zero Mean Correction
    5.1.6. Target Update Frequency
  5.2. Numerical Studies
    5.2.1. Single Degree of Freedom (SDOF)
      5.2.1.1. Observation Tuning Based on SDOF Natural Frequency
      5.2.1.2. Reward Weighting Factors Effect
    5.2.2. Multiple Degree of Freedom (MDOF)
      5.2.2.1. Controller Design
      5.2.2.2. Numerical Analysis and Result
  5.3. Experimental Studies
    5.3.1. Experimental Setup
    5.3.2. System Identification
    5.3.3. Controller Design
      5.3.3.1. Reinforcement Learning
      5.3.3.2. AMD Force Tracking Controller
    5.3.4. Experiment Result and Analysis
CHAPTER 6: SUMMARY AND CONCLUSIONS
  6.1. Summary
  6.2. Conclusions
REFERENCES


Full Text Release Date: 2025/08/07 (campus network)
Full Text Release Date: 2025/08/07 (off-campus network)
Full Text Release Date: 2025/08/07 (National Central Library: Taiwan thesis system)