
Graduate Student: Carlos Andres Palacios Caicedo
Thesis Title: A Deep Reinforcement Learning Model for Solving the Pollution Routing Problem in Sustainable Logistics Management
Advisor: Shih-Che Lo
Committee Members: Chao Ou-Yang, Shih-Hsien Tseng, Shih-Che Lo
Degree: Master
Department: College of Management - Department of Industrial Management
Thesis Publication Year: 2023
Graduation Academic Year: 111 (ROC calendar, 2022-2023)
Language: English
Pages: 59
Keywords: Deep Reinforcement Learning, Pollution-Routing Problem, Lubes Industry, Sustainable Logistics Management, Distribution Channel Optimization


Abstract:
    Environmental contamination has increased in recent years, prompting new regulations and internal controls that push companies toward sustainability. The logistics transportation that companies rely on is a major producer of CO2, making it a key factor to optimize when reducing the carbon footprint. The main objective of this thesis is to help reduce carbon emissions by optimizing routing in companies' delivery processes. Therefore, a new methodology that uses Deep Reinforcement Learning (DRL) to solve the Pollution-Routing Problem (PRP) is presented. The model captures distance and weight information to optimize the trucks' fuel consumption. It learns from training on different instances and can then be called for inference on new instances without re-training. The model is built on an Actor-Critic method with an attention mechanism for the Actor. Its hyperparameters were tuned through a rigorous design of experiments so that the model yields near-optimal solutions, and it was tested under various scenarios to confirm that the solutions are feasible and of high quality. Finally, a case study from the lubes industry in Colombia illustrates the importance of implementing the proposed model. The model can be applied to delivery processes in any logistics operation with environmental concerns, allowing companies to track and control their carbon emissions through daily routing so as to comply with government regulations.
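    The record does not include the thesis code. As a rough illustration of the pipeline the abstract describes (an attention-based Actor that picks the next customer, a Critic used as the policy-gradient baseline, and a fuel cost that grows with both distance and carried weight), the following PyTorch sketch may help. Every class, function, coefficient, and dimension in it (AttentionActor, Critic, fuel_cost, alpha, beta, the random 10-customer instance) is an illustrative assumption, not the author's implementation.

import torch
import torch.nn as nn

class AttentionActor(nn.Module):
    """Scores the unvisited customers by attending from the current node."""
    def __init__(self, dim=128):
        super().__init__()
        self.embed = nn.Linear(3, dim)   # (x, y, demand) -> node embedding
        self.q = nn.Linear(dim, dim)     # query from the current node
        self.k = nn.Linear(dim, dim)     # keys for all candidate nodes

    def forward(self, nodes, current, mask):
        h = self.embed(nodes)                                      # (n, dim)
        scores = self.k(h) @ self.q(h[current]) / h.size(-1) ** 0.5
        scores = scores.masked_fill(mask, float("-inf"))           # hide visited nodes
        return torch.distributions.Categorical(logits=scores)

class Critic(nn.Module):
    """Baseline network: estimates the tour cost from the mean node embedding."""
    def __init__(self, dim=128):
        super().__init__()
        self.embed = nn.Linear(3, dim)
        self.value = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                   nn.Linear(dim, 1))

    def forward(self, nodes):
        return self.value(self.embed(nodes).mean(dim=0)).squeeze()

def fuel_cost(dist, load, alpha=1.0, beta=0.05):
    # Illustrative load-dependent fuel model: an empty-truck term plus a
    # payload term, so legs driven with a heavier load cost (emit) more.
    # alpha and beta are assumed coefficients, not values from the thesis.
    return dist * (alpha + beta * load)

def rollout(actor, coords, demands):
    """Samples one tour that starts fully loaded at depot node 0, visits
    every customer once, and accumulates the load-dependent fuel cost."""
    nodes = torch.cat([coords, demands.unsqueeze(-1)], dim=-1)
    n = nodes.size(0)
    mask = torch.zeros(n, dtype=torch.bool)
    mask[0] = True                        # depot is not a mid-tour choice
    load = demands.sum().item()           # truck leaves the depot fully loaded
    cur, cost, logps = 0, 0.0, []
    for _ in range(n - 1):
        policy = actor(nodes, cur, mask)
        nxt = policy.sample()
        logps.append(policy.log_prob(nxt))
        cost += fuel_cost(torch.dist(coords[cur], coords[nxt]).item(), load)
        load -= demands[nxt].item()       # deliver this customer's demand
        mask[nxt] = True
        cur = nxt.item()
    cost += fuel_cost(torch.dist(coords[cur], coords[0]).item(), load)  # return leg, empty
    return cost, torch.stack(logps)

# One actor-critic update on a random 10-customer instance: the critic's
# estimate serves as the baseline, and the actor follows the policy gradient.
coords, demands = torch.rand(11, 2), torch.rand(11)
demands[0] = 0.0                          # node 0 is the depot
actor, critic = AttentionActor(), Critic()
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-4)
cost, logps = rollout(actor, coords, demands)
baseline = critic(torch.cat([coords, demands.unsqueeze(-1)], dim=-1))
advantage = cost - baseline
loss = advantage.detach() * logps.sum() + advantage.pow(2)  # actor + critic losses
opt.zero_grad(); loss.backward(); opt.step()

    In this sketch the advantage term drives both networks: detached, it weights the summed log-probabilities for the actor update, and squared, it doubles as the critic's regression loss. The hyperparameter tuning via design of experiments, multiple vehicles, and other elements of the full PRP are omitted.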

Table of Contents:
    Abstract i
    中文摘要 (Chinese Abstract) ii
    Acknowledgment iii
    Table of Contents iv
    List of Tables vi
    List of Figures vii
    Chapter 1 Introduction 1
      1.1 Research Motivation 1
      1.2 Focus and Scope 3
      1.3 Research Objective 3
      1.4 Research Overview 4
    Chapter 2 Literature Review 6
      2.1 DRL Concepts 6
        2.1.1 Reinforcement Learning Structure 6
        2.1.2 Actor-Critic Structure 6
        2.1.3 Attention Mechanisms 7
      2.2 Vehicle Routing Problem 7
        2.2.1 VRP Evolution and Variations 7
        2.2.2 DRL Models Applied to VRP 7
        2.2.3 Metaheuristics Applied to VRP 8
        2.2.4 Classic Heuristics Applied to VRP 9
        2.2.5 Exact Algorithms Applied to VRP 9
      2.3 Pollution Routing Problem 10
        2.3.1 PRP Evolution and Variations 11
        2.3.2 Metaheuristics Applied to PRP 11
        2.3.3 Exact Algorithms Applied to PRP 11
      2.4 Green Vehicle Routing Problem 12
        2.4.1 GVRP Evolution and Variations 12
        2.4.2 Metaheuristics Applied to GVRP 12
        2.4.3 Classic Heuristics Applied to GVRP 13
      2.5 Other Combinatorial Problems 14
        2.5.1 DRL Models Applied to Other Combinatorial Problems 14
        2.5.2 Metaheuristics Applied to Other Combinatorial Problems 15
        2.5.3 Classic Heuristics Applied to Other Combinatorial Problems 15
        2.5.4 Exact Algorithms Applied to Other Combinatorial Problems 15
    Chapter 3 Research Methodology 18
      3.1 Mathematical Models 18
        3.1.1 Parameters 18
        3.1.2 Decision Variables 18
        3.1.3 Objective Function 18
        3.1.4 Constraints 18
      3.2 DRL Model Applied 19
    Chapter 4 Computational Experiments 24
      4.1 Parameter Tuning 24
      4.2 Training 27
      4.3 Testing 27
    Chapter 5 Case Study 32
      5.1 Lubes Industry 32
      5.2 Manufacturing Process 32
      5.3 Distribution Channels 34
      5.4 Company Overview 36
    Chapter 6 Conclusions and Future Research 39
    Appendix 41
    References 46


    Full text available from 2026/02/08 (campus network)
    Full text available from 2026/02/08 (off-campus network)
    Full text available from 2026/02/08 (National Central Library: Taiwan thesis and dissertation system)