
Graduate Student: Carlos Andres Palacios Caicedo
Thesis Title: A Deep Reinforcement Learning Model for Solving the Pollution Routing Problem in Sustainable Logistics Management
Advisor: Shih-Che Lo
Committee Members: Chao Ou-Yang, Shih-Hsien Tseng, Shih-Che Lo
Degree: Master
Department: College of Management - Department of Industrial Management
Thesis Publication Year: 2023
Graduation Academic Year: 111 (ROC calendar, 2022-2023)
Language: English
Pages: 59
Keywords: Deep Reinforcement Learning, Pollution-Routing Problem, Lubes Industry, Sustainable Logistics Management, Distribution Channel Optimization


Abstract:
    Environmental contamination has increased in recent years, prompting new regulations and internal controls that push companies toward sustainability. The logistics transportation that companies rely on is a major producer of CO2, making it a key factor to optimize when reducing the carbon footprint. The main objective of this thesis is to help reduce carbon emissions by optimizing routing in companies' delivery processes. Therefore, a new methodology that uses Deep Reinforcement Learning (DRL) to solve the Pollution-Routing Problem (PRP) is presented. The model captures distance and weight information to optimize the trucks' fuel consumption. It learns from training on different instances and can then be called for inference on new instances without re-training. The model is built on an Actor-Critic method with an attention mechanism for the Actor. Its hyperparameters were tuned through a rigorous design of experiments so that the model yields near-optimal solutions, and it was tested under various scenarios to confirm that the solutions are feasible and of high quality. Finally, a case study from the lubes industry in Colombia illustrates the importance of implementing the proposed model. The model can be applied to delivery processes in any logistics operation with environmental concerns, allowing companies to track and control their carbon emissions through daily routing so as to comply with government regulations.
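    The record does not include the thesis code. As a rough illustration of the pipeline the abstract describes (an attention-based Actor that picks the next customer, a Critic used as the policy-gradient baseline, and a fuel cost that grows with both distance and carried weight), the following PyTorch sketch may help. Every class, function, coefficient, and dimension in it (AttentionActor, Critic, fuel_cost, alpha, beta, the random 10-customer instance) is an illustrative assumption, not the author's implementation.

import torch
import torch.nn as nn

class AttentionActor(nn.Module):
    """Scores the unvisited customers by attending from the current node."""
    def __init__(self, dim=128):
        super().__init__()
        self.embed = nn.Linear(3, dim)   # (x, y, demand) -> node embedding
        self.q = nn.Linear(dim, dim)     # query from the current node
        self.k = nn.Linear(dim, dim)     # keys for all candidate nodes

    def forward(self, nodes, current, mask):
        h = self.embed(nodes)                                      # (n, dim)
        scores = self.k(h) @ self.q(h[current]) / h.size(-1) ** 0.5
        scores = scores.masked_fill(mask, float("-inf"))           # hide visited nodes
        return torch.distributions.Categorical(logits=scores)

class Critic(nn.Module):
    """Baseline network: estimates the tour cost from the mean node embedding."""
    def __init__(self, dim=128):
        super().__init__()
        self.embed = nn.Linear(3, dim)
        self.value = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                   nn.Linear(dim, 1))

    def forward(self, nodes):
        return self.value(self.embed(nodes).mean(dim=0)).squeeze()

def fuel_cost(dist, load, alpha=1.0, beta=0.05):
    # Illustrative load-dependent fuel model: an empty-truck term plus a
    # payload term, so legs driven with a heavier load cost (emit) more.
    # alpha and beta are assumed coefficients, not values from the thesis.
    return dist * (alpha + beta * load)

def rollout(actor, coords, demands):
    """Samples one tour that starts fully loaded at depot node 0, visits
    every customer once, and accumulates the load-dependent fuel cost."""
    nodes = torch.cat([coords, demands.unsqueeze(-1)], dim=-1)
    n = nodes.size(0)
    mask = torch.zeros(n, dtype=torch.bool)
    mask[0] = True                        # depot is not a mid-tour choice
    load = demands.sum().item()           # truck leaves the depot fully loaded
    cur, cost, logps = 0, 0.0, []
    for _ in range(n - 1):
        policy = actor(nodes, cur, mask)
        nxt = policy.sample()
        logps.append(policy.log_prob(nxt))
        cost += fuel_cost(torch.dist(coords[cur], coords[nxt]).item(), load)
        load -= demands[nxt].item()       # deliver this customer's demand
        mask[nxt] = True
        cur = nxt.item()
    cost += fuel_cost(torch.dist(coords[cur], coords[0]).item(), load)  # return leg, empty
    return cost, torch.stack(logps)

# One actor-critic update on a random 10-customer instance: the critic's
# estimate serves as the baseline, and the actor follows the policy gradient.
coords, demands = torch.rand(11, 2), torch.rand(11)
demands[0] = 0.0                          # node 0 is the depot
actor, critic = AttentionActor(), Critic()
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-4)
cost, logps = rollout(actor, coords, demands)
baseline = critic(torch.cat([coords, demands.unsqueeze(-1)], dim=-1))
advantage = cost - baseline
loss = advantage.detach() * logps.sum() + advantage.pow(2)  # actor + critic losses
opt.zero_grad(); loss.backward(); opt.step()

    In this sketch the advantage term drives both networks: detached, it weights the summed log-probabilities for the actor update, and squared, it doubles as the critic's regression loss. The hyperparameter tuning via design of experiments, multiple vehicles, and other elements of the full PRP are omitted.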

Table of Contents:
    Abstract i
    中文摘要 (Chinese Abstract) ii
    Acknowledgment iii
    Table of Contents iv
    List of Tables vi
    List of Figures vii
    Chapter 1 Introduction 1
      1.1 Research Motivation 1
      1.2 Focus and Scope 3
      1.3 Research Objective 3
      1.4 Research Overview 4
    Chapter 2 Literature Review 6
      2.1 DRL Concepts 6
        2.1.1 Reinforcement Learning Structure 6
        2.1.2 Actor-Critic Structure 6
        2.1.3 Attention Mechanisms 7
      2.2 Vehicle Routing Problem 7
        2.2.1 VRP Evolution and Variations 7
        2.2.2 DRL Models Applied to VRP 7
        2.2.3 Metaheuristics Applied to VRP 8
        2.2.4 Classic Heuristics Applied to VRP 9
        2.2.5 Exact Algorithms Applied to VRP 9
      2.3 Pollution Routing Problem 10
        2.3.1 PRP Evolution and Variations 11
        2.3.2 Metaheuristics Applied to PRP 11
        2.3.3 Exact Algorithms Applied to PRP 11
      2.4 Green Vehicle Routing Problem 12
        2.4.1 GVRP Evolution and Variations 12
        2.4.2 Metaheuristics Applied to GVRP 12
        2.4.3 Classic Heuristics Applied to GVRP 13
      2.5 Other Combinatorial Problems 14
        2.5.1 DRL Models Applied to Other Combinatorial Problems 14
        2.5.2 Metaheuristics Applied to Other Combinatorial Problems 15
        2.5.3 Classic Heuristics Applied to Other Combinatorial Problems 15
        2.5.4 Exact Algorithms Applied to Other Combinatorial Problems 15
    Chapter 3 Research Methodology 18
      3.1 Mathematical Models 18
        3.1.1 Parameters 18
        3.1.2 Decision Variables 18
        3.1.3 Objective Function 18
        3.1.4 Constraints 18
      3.2 DRL Model Applied 19
    Chapter 4 Computational Experiments 24
      4.1 Parameter Tuning 24
      4.2 Training 27
      4.3 Testing 27
    Chapter 5 Case Study 32
      5.1 Lubes Industry 32
      5.2 Manufacturing Process 32
      5.3 Distribution Channels 34
      5.4 Company Overview 36
    Chapter 6 Conclusions and Future Research 39
    Appendix 41
    References 46


    Full text available from 2026/02/08 (campus network)
    Full text available from 2026/02/08 (off-campus network)
    Full text available from 2026/02/08 (National Central Library: Taiwan thesis and dissertation system)