
Student: Han-Lin Huang (黃涵琳)
Thesis Title: A Dynamic Resource Allocation Framework in a Kubernetes Cluster using Machine Learning
Advisor: Yung-Ho Leu (呂永和)
Committee Members: Wei-Ning Yang (楊維寧), Yun-Shiow Chen (陳雲岫)
Degree: Master
Department: Department of Information Management, School of Management
Year of Publication: 2023
Graduation Academic Year: 111 (2022-2023)
Language: English
Number of Pages: 56
Keywords: Kubernetes, Autoscaling, Resource Allocation, MLOps, Time-series Analysis, Machine Learning


In recent years, many developers have chosen cloud environments to deploy their applications, and container technology has gained popularity owing to its light weight, fast startup, low resource consumption, and low cost. Kubernetes, an open-source container orchestration platform, provides an autoscaling feature that addresses clients' dynamic resource requirements through a reactive approach. Existing research has shown that developing custom resource autoscalers with deep neural network models can further improve scalability. Several studies have also emphasized the importance of updating machine learning (ML) models as new data accumulates, so that the models stay aligned with evolving real-world conditions.

In this thesis, we propose a framework that leverages Kubernetes to provide proactive custom resource autoscaling for web applications. It uses Long Short-Term Memory (LSTM) and Bidirectional LSTM (Bi-LSTM) neural network models to dynamically predict a website's workload at runtime. Furthermore, it continuously updates the ML models by combining newly arriving data with historical data, allowing the models to keep adapting and to reflect real-world conditions more accurately. The experimental results indicate that using up-to-date models for workload prediction in the autoscaler yields slightly more accurate predictions and less resource idle time than using pre-trained models.
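To make the prediction component concrete, the following is a minimal sketch of a Bi-LSTM workload forecaster that is periodically refit on combined historical and newly collected data, in the spirit of the approach described above. It assumes TensorFlow/Keras; the window length, layer width, training settings, and the placeholder arrays are illustrative assumptions, not values taken from the thesis.

```python
# Illustrative sketch only: a Bi-LSTM that predicts the next interval's request
# count from a sliding window of past counts, then is refit on old + new data so
# the model tracks current traffic. Window size, layer size, and epochs are assumed.
import numpy as np
import tensorflow as tf

WINDOW = 30  # number of past intervals used as model input (assumed)

def make_windows(series: np.ndarray, window: int = WINDOW):
    """Turn a 1-D series of per-interval request counts into (X, y) pairs."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., np.newaxis], np.array(y)

def build_bilstm() -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(WINDOW, 1)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(1),  # predicted request count for the next interval
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Initial training on archived workload, then continuous updating: when a batch
# of new observations arrives, retrain on the concatenation of old and new data.
historical = np.random.rand(2000)        # placeholder for archived workload data
model = build_bilstm()
model.fit(*make_windows(historical), epochs=10, verbose=0)

new_observations = np.random.rand(200)   # placeholder for newly scraped metrics
combined = np.concatenate([historical, new_observations])
model.fit(*make_windows(combined), epochs=5, verbose=0)

next_count = model.predict(make_windows(combined)[0][-1:])  # one-step-ahead forecast
```

In the framework described in the thesis, this retraining step corresponds to the continuous-training pipeline: the refreshed model is redeployed so the autoscaler always forecasts with a model fitted to the most recent traffic.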

Abstract (Chinese)
ABSTRACT
ACKNOWLEDGEMENT
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
    1.1 Research Background
    1.2 Research Motivation
    1.3 Research Method
    1.4 Research Overview
Chapter 2 Related Work
    2.1
        2.1.1 Statistical Techniques
        2.1.2 Machine Learning Techniques
        2.1.3 Rule-based Techniques
        2.1.4 Control Theory Techniques
    2.2 MLOps
Chapter 3 Framework Design and Implementation
    3.1 System Overview
        3.1.1 Phase 1
        3.1.2 Phase 2
        3.1.3 Phase 3
    3.2 Dataset
    3.3 ML Model
        3.3.1 LSTM
        3.3.2 Bi-LSTM
        3.3.3 Implementation Details
        3.3.4 Model Performance Evaluation
        3.3.5 Model Deployment
    3.4 Custom Proactive Autoscaler
        3.4.1 Monitor phase
        3.4.2 Analyze phase - Model Selection Service
        3.4.3 Plan phase - Adaptive Management Service
        3.4.4 Execute phase
        3.4.5 Autoscaler Performance Evaluation
    3.5 Key Components in Kubernetes Cluster
        3.5.1 Web Application
        3.5.2 Continuous Training and Continuous Deployment
        3.5.3 Monitoring Metrics
Chapter 4 Experiment
    4.1 Experimental Setting
        4.1.1 Testbed
        4.1.2 Virtual Machine
        4.1.3 Software
    4.2 Simulating Real Web Request
    4.3 Result
        4.3.1 LSTM Model Performance
        4.3.2 Bi-LSTM Model Performance
        4.3.3 Custom Autoscaler Performance
        4.3.4 Web Workload Distribution Visualization
Chapter 5 Conclusion and Future Work
    5.1 Conclusion
    5.2 Future Work
Reference
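The outline above lists Monitor, Analyze, Plan, and Execute phases for the custom proactive autoscaler. The sketch below shows one way such a loop could be wired together, assuming Prometheus as the metrics source and the official kubernetes Python client; the Prometheus query, deployment and namespace names, per-pod capacity, replica bounds, and the scaling interval are hypothetical and do not reproduce the thesis's Model Selection or Adaptive Management services.

```python
# Hypothetical sketch of a monitor/analyze/plan/execute autoscaling loop.
# All names, queries, and capacity figures below are illustrative assumptions.
import math
import time
import requests
from kubernetes import client, config

PROMETHEUS_URL = "http://prometheus.monitoring:9090"   # assumed in-cluster address
DEPLOYMENT, NAMESPACE = "web-app", "default"           # hypothetical names
REQUESTS_PER_POD = 100                                  # assumed per-pod capacity
MIN_REPLICAS, MAX_REPLICAS = 1, 10

def monitor() -> list[float]:
    """Fetch the current aggregate request rate from Prometheus."""
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query",
                        params={"query": "sum(rate(http_requests_total[1m]))"})
    value = float(resp.json()["data"]["result"][0]["value"][1])
    return [value]

def plan(predicted_requests: float) -> int:
    """Translate the forecast into a bounded desired replica count."""
    desired = math.ceil(predicted_requests / REQUESTS_PER_POD)
    return max(MIN_REPLICAS, min(MAX_REPLICAS, desired))

def execute(replicas: int) -> None:
    """Scale the Deployment to the planned number of replicas."""
    apps = client.AppsV1Api()
    apps.patch_namespaced_deployment_scale(
        DEPLOYMENT, NAMESPACE, {"spec": {"replicas": replicas}})

def run_loop(forecast) -> None:
    """`forecast` maps the observed history to the next predicted request rate,
    e.g. the Bi-LSTM sketched after the abstract."""
    config.load_incluster_config()   # use load_kube_config() when run off-cluster
    history: list[float] = []
    while True:
        history.extend(monitor())        # Monitor
        predicted = forecast(history)    # Analyze
        execute(plan(predicted))         # Plan + Execute
        time.sleep(60)                   # one scaling interval (assumed)
```

The replica calculation here is just the usual capacity-based rule (ceiling of predicted load over assumed per-pod capacity, clamped to replica bounds); it stands in for, rather than reproduces, the Adaptive Management Service described in Chapter 3.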


Full-text release date: 2025/07/21 (campus network)
Full-text release date: 2025/07/21 (off-campus network)
Full-text release date: 2025/07/21 (National Central Library: Taiwan thesis and dissertation system)