
Graduate student: 吳宜真 (Yi-Chen Wu)
Thesis title: 採用深度強化學習結合機器學習預測模型及主幹道影響之自適性交通信號系統設計
Design of an Adaptive Traffic Signal System Using Deep Reinforcement Learning Incorporated the Machine Learning Forecasting Model and the Impact of Arterial Roads
Advisor: 馮輝文 (Huei-Wen Ferng)
Committee members: 黃琴雅, 謝宏昀, 吳中實
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2023
Academic year of graduation: 111
Language: Chinese
Pages: 60
Chinese keywords: 交通信號燈控制、強化學習、交通預測、隨機森林迴歸模型、主幹道之影響
English keywords: Traffic Light Control, Reinforcement Learning, Traffic Forecasting, Random Forest Regression Model, Impact of Arterial Roads

Abstract (Chinese): Because traditional algorithms cannot effectively adapt to the dynamic changes of a traffic network, reinforcement learning (RL) has attracted considerable attention for solving the traffic light control (TLC) problem. However, existing RL-based methods perform traffic regulation based only on the current conditions on the road; although they successfully relieve existing traffic congestion, they cannot effectively prevent congestion from forming because they do not exploit future information in advance. This thesis therefore combines reinforcement learning with traffic forecasting. It adopts the classic random forest regression model from machine learning, together with the traffic features proposed in this thesis, so that the model can more effectively predict short-term future traffic flow at intersections. The forecast is then applied to the green phase duration of the traffic lights: the agent combines the predicted results with the currently observed traffic conditions to control the signal phase and the green duration more effectively and dynamically. In addition, we take the influence of arterial roads on overall traffic into account in the reinforcement learning reward function, which helps reflect whether the agent's chosen actions better meet the needs of the environment. Simulation results show that the proposed system design outperforms closely related designs in the literature.
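The forecasting step described in the abstract can be sketched as follows. This is a minimal illustration only: the lagged-count and time-of-day features, the 5-minute interval, and the synthetic data are assumptions for demonstration, not the thesis's exact feature set or dataset.

```python
# Hypothetical sketch: short-term intersection flow prediction with a
# random forest regressor. Features (three lagged counts plus a
# time-of-day index) are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic history: vehicle counts per 5-minute interval (288 per day).
t = np.arange(2000)
flow = 20 + 10 * np.sin(2 * np.pi * t / 288) + rng.normal(0, 2, t.size)

# Each row uses the 3 previous counts plus time-of-day to predict the next count.
lags = 3
X = np.column_stack([flow[i:len(flow) - lags + i] for i in range(lags)]
                    + [t[lags:] % 288])
y = flow[lags:]

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:1500], y[:1500])
pred = model.predict(X[1500:])

# Mean absolute error on the held-out tail of the series.
mae = float(np.abs(pred - y[1500:]).mean())
print(mae)
```

With noise of standard deviation 2 in the synthetic series, the error lands near the noise floor; in the thesis's setting the model would be trained on observed intersection counts instead.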


Abstract (English): Because traditional algorithms cannot effectively adapt to the dynamic changes of the traffic network, reinforcement learning has attracted great attention in solving traffic signal control problems. However, in existing reinforcement-learning-based methods, traffic adjustment is performed only according to the current traffic conditions on the road. Although the existing traffic congestion can be successfully relieved, the occurrence of traffic congestion cannot be effectively avoided because future information is not utilized in advance.
Therefore, this thesis aims to combine reinforcement learning with a traffic prediction technique. Using the classic random forest regression model of machine learning together with the traffic features developed in this thesis, our proposed model can more effectively predict the short-term future traffic flow at intersections. The obtained traffic prediction results are then applied to the green phase duration of traffic lights by allowing the agent to integrate the predicted results with the current traffic observation, so as to control the phase of the traffic lights and the duration of the green light more effectively and dynamically. In addition, we also consider the impact of arterial roads, applied to the reinforcement learning reward function, to help reflect whether the agent's action selection fits the needs of the environment. Finally, our simulation results demonstrate that the proposed adaptive traffic signal system can outperform the closely related systems in the literature.
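One way the arterial-road impact described above could enter the reward is as a weighted negative queue-length penalty, so the agent is penalized harder when the main corridor backs up. The weight value, lane names, and queue inputs below are hypothetical illustrations, not the thesis's exact formulation.

```python
# Hypothetical sketch: reward weighting arterial-road queues more heavily
# than side-road queues. ARTERIAL_WEIGHT = 2.0 is an assumed value.
ARTERIAL_WEIGHT = 2.0

def reward(queues: dict[str, int], arterial_lanes: set[str]) -> float:
    """Negative weighted sum of per-lane queue lengths."""
    total = 0.0
    for lane, q in queues.items():
        w = ARTERIAL_WEIGHT if lane in arterial_lanes else 1.0
        total += w * q
    return -total

# Example: east-west is the arterial, so its queues count double.
r = reward({"N_in": 4, "S_in": 2, "E_in": 5, "W_in": 1},
           arterial_lanes={"E_in", "W_in"})
print(r)  # -(4 + 2 + 2*5 + 2*1) = -18.0
```

A higher (less negative) reward thus favors actions that keep arterial queues short, which is the qualitative behavior the abstract attributes to the arterial-aware reward.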

Table of contents:
Advisor's recommendation letter . . . i
Committee approval certificate . . . ii
Chinese abstract . . . iii
English abstract . . . iv
Acknowledgements . . . v
Table of contents . . . vi
List of tables . . . ix
List of figures . . . x
Chapter 1: Introduction . . . 1
1.1 Research background . . . 1
1.1.1 Traffic Light Control (TLC) . . . 2
1.1.2 Single-intersection and multi-intersection signal control . . . 2
1.1.3 The important influence of arterial roads on traffic . . . 3
1.2 The impact of reinforcement learning on traffic signals . . . 3
1.3 The need for traffic forecasting in traffic signal control . . . 4
1.4 Research motivation . . . 4
1.5 Thesis organization . . . 5
Chapter 2: Related work . . . 6
2.1 Intelligent Transportation System (ITS) . . . 6
2.1.1 Vehicular Ad Hoc Networks (VANETs) . . . 7
2.1.2 Traffic Rerouting . . . 7
2.1.3 Traffic Light Control (TLC) . . . 8
2.2 Traffic Prediction . . . 9
2.2.1 Attention mechanism . . . 9
2.2.2 Random Forest Regression . . . 10
2.3 Reinforcement Learning (RL) . . . 11
2.3.1 Key reinforcement learning terms and workflow . . . 12
2.3.2 Related research on reinforcement learning . . . 13
2.3.3 Applications of reinforcement learning to traffic signals . . . 14
2.3.4 Introduction to the baselines Efficient Colight and Presslight . . . 15
Chapter 3: Method design and workflow . . . 19
3.1 Problem description and intersection design . . . 19
3.2 Method design . . . 21
3.2.1 Traffic forecasting design . . . 22
3.2.2 Reinforcement learning design . . . 24
3.3 System architecture . . . 29
Chapter 4: Discussion and analysis of simulation results . . . 33
4.1 Simulation environment and parameter settings . . . 33
4.2 Datasets used and baselines . . . 35
4.3 Simulation comparison and discussion . . . 37
4.3.1 Arterial-road contribution analysis on real-world and randomly synthesized traffic datasets . . . 37
4.3.2 Comparative analysis under different numbers of vehicles . . . 40
Chapter 5: Conclusion . . . 42
References . . . 43


Full text available: 2026/02/15 (campus network)
Full text available: 2026/02/15 (off-campus network)
Full text available: 2026/02/15 (National Central Library: Taiwan NDLTD system)