
Student: Han-Lin Huang (黃涵琳)
Thesis Title: A Dynamic Resource Allocation Framework in a Kubernetes Cluster using Machine Learning
Advisor: Yung-Ho Leu (呂永和)
Committee Members: Wei-Ning Yang (楊維寧), Yun-Shiow Chen (陳雲岫)
Degree: Master
Department: Department of Information Management, School of Management
Year of Publication: 2023
Graduation Academic Year: 111 (2022-2023)
Language: English
Number of Pages: 56
Keywords: Kubernetes, Autoscaling, Resource Allocation, MLOps, Time-series Analysis, Machine Learning


In recent years, many developers have chosen cloud environments to deploy their applications, and container technology has gained popularity owing to its light weight, fast startup, low resource consumption, and low cost. Kubernetes, an open-source container orchestration platform, provides an autoscaling feature that addresses clients' dynamic resource requirements through a reactive approach. Existing research has shown that developing custom resource autoscalers with deep neural network models can further improve scalability. Several studies have also emphasized the importance of updating machine learning (ML) models as new data accumulates, so that the models stay aligned with evolving real-world conditions.

In this thesis, we propose a framework that leverages Kubernetes to provide proactive custom resource autoscaling for web applications. It uses Long Short-Term Memory (LSTM) and Bidirectional LSTM (Bi-LSTM) neural network models to dynamically predict a website's workload at runtime. Furthermore, it continuously updates the ML models by combining newly arriving data with historical data, allowing the models to keep adapting and to reflect real-world conditions more accurately. The experimental results indicate that using up-to-date models for workload prediction in the autoscaler yields slightly more accurate predictions and less resource idle time than using pre-trained models.
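To make the prediction component concrete, the following is a minimal sketch of a Bi-LSTM workload forecaster that is periodically refit on combined historical and newly collected data, in the spirit of the approach described above. It assumes TensorFlow/Keras; the window length, layer width, training settings, and the placeholder arrays are illustrative assumptions, not values taken from the thesis.

```python
# Illustrative sketch only: a Bi-LSTM that predicts the next interval's request
# count from a sliding window of past counts, then is refit on old + new data so
# the model tracks current traffic. Window size, layer size, and epochs are assumed.
import numpy as np
import tensorflow as tf

WINDOW = 30  # number of past intervals used as model input (assumed)

def make_windows(series: np.ndarray, window: int = WINDOW):
    """Turn a 1-D series of per-interval request counts into (X, y) pairs."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., np.newaxis], np.array(y)

def build_bilstm() -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(WINDOW, 1)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(1),  # predicted request count for the next interval
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Initial training on archived workload, then continuous updating: when a batch
# of new observations arrives, retrain on the concatenation of old and new data.
historical = np.random.rand(2000)        # placeholder for archived workload data
model = build_bilstm()
model.fit(*make_windows(historical), epochs=10, verbose=0)

new_observations = np.random.rand(200)   # placeholder for newly scraped metrics
combined = np.concatenate([historical, new_observations])
model.fit(*make_windows(combined), epochs=5, verbose=0)

next_count = model.predict(make_windows(combined)[0][-1:])  # one-step-ahead forecast
```

In the framework described in the thesis, this retraining step corresponds to the continuous-training pipeline: the refreshed model is redeployed so the autoscaler always forecasts with a model fitted to the most recent traffic.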

Abstract (Chinese)
ABSTRACT
ACKNOWLEDGEMENT
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
    1.1 Research Background
    1.2 Research Motivation
    1.3 Research Method
    1.4 Research Overview
Chapter 2 Related Work
    2.1
        2.1.1 Statistical Techniques
        2.1.2 Machine Learning Techniques
        2.1.3 Rule-based Techniques
        2.1.4 Control Theory Techniques
    2.2 MLOps
Chapter 3 Framework Design and Implementation
    3.1 System Overview
        3.1.1 Phase 1
        3.1.2 Phase 2
        3.1.3 Phase 3
    3.2 Dataset
    3.3 ML Model
        3.3.1 LSTM
        3.3.2 Bi-LSTM
        3.3.3 Implementation Details
        3.3.4 Model Performance Evaluation
        3.3.5 Model Deployment
    3.4 Custom Proactive Autoscaler
        3.4.1 Monitor phase
        3.4.2 Analyze phase - Model Selection Service
        3.4.3 Plan phase - Adaptive Management Service
        3.4.4 Execute phase
        3.4.5 Autoscaler Performance Evaluation
    3.5 Key Components in Kubernetes Cluster
        3.5.1 Web Application
        3.5.2 Continuous Training and Continuous Deployment
        3.5.3 Monitoring Metrics
Chapter 4 Experiment
    4.1 Experimental Setting
        4.1.1 Testbed
        4.1.2 Virtual Machine
        4.1.3 Software
    4.2 Simulating Real Web Request
    4.3 Result
        4.3.1 LSTM Model Performance
        4.3.2 Bi-LSTM Model Performance
        4.3.3 Custom Autoscaler Performance
        4.3.4 Web Workload Distribution Visualization
Chapter 5 Conclusion and Future Work
    5.1 Conclusion
    5.2 Future Work
Reference
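The outline above lists Monitor, Analyze, Plan, and Execute phases for the custom proactive autoscaler. The sketch below shows one way such a loop could be wired together, assuming Prometheus as the metrics source and the official kubernetes Python client; the Prometheus query, deployment and namespace names, per-pod capacity, replica bounds, and the scaling interval are hypothetical and do not reproduce the thesis's Model Selection or Adaptive Management services.

```python
# Hypothetical sketch of a monitor/analyze/plan/execute autoscaling loop.
# All names, queries, and capacity figures below are illustrative assumptions.
import math
import time
import requests
from kubernetes import client, config

PROMETHEUS_URL = "http://prometheus.monitoring:9090"   # assumed in-cluster address
DEPLOYMENT, NAMESPACE = "web-app", "default"           # hypothetical names
REQUESTS_PER_POD = 100                                  # assumed per-pod capacity
MIN_REPLICAS, MAX_REPLICAS = 1, 10

def monitor() -> list[float]:
    """Fetch the current aggregate request rate from Prometheus."""
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query",
                        params={"query": "sum(rate(http_requests_total[1m]))"})
    value = float(resp.json()["data"]["result"][0]["value"][1])
    return [value]

def plan(predicted_requests: float) -> int:
    """Translate the forecast into a bounded desired replica count."""
    desired = math.ceil(predicted_requests / REQUESTS_PER_POD)
    return max(MIN_REPLICAS, min(MAX_REPLICAS, desired))

def execute(replicas: int) -> None:
    """Scale the Deployment to the planned number of replicas."""
    apps = client.AppsV1Api()
    apps.patch_namespaced_deployment_scale(
        DEPLOYMENT, NAMESPACE, {"spec": {"replicas": replicas}})

def run_loop(forecast) -> None:
    """`forecast` maps the observed history to the next predicted request rate,
    e.g. the Bi-LSTM sketched after the abstract."""
    config.load_incluster_config()   # use load_kube_config() when run off-cluster
    history: list[float] = []
    while True:
        history.extend(monitor())        # Monitor
        predicted = forecast(history)    # Analyze
        execute(plan(predicted))         # Plan + Execute
        time.sleep(60)                   # one scaling interval (assumed)
```

The replica calculation here is just the usual capacity-based rule (ceiling of predicted load over assumed per-pod capacity, clamped to replica bounds); it stands in for, rather than reproduces, the Adaptive Management Service described in Chapter 3.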


Full-text release date: 2025/07/21 (campus network)
Full-text release date: 2025/07/21 (off-campus network)
Full-text release date: 2025/07/21 (National Central Library: Taiwan thesis and dissertation system)