
Graduate student: 石欣玉
Hsin-Yu Shih
Thesis title: 異質雲端環境下基於節點負載之動態任務調度機制
Dynamic Task Allocation Scheme Based on Node Workload in a Heterogeneous Cloud Environment
Advisor: 呂政修
Jenq-Shiou Leu
Oral defense committee: 孫敏德
Min-Te Sun
阮聖彰
Shanq-Jang Ruan
鄭瑞光
Ray-Guang Cheng
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Publication year: 2012
Graduation academic year: 100
Language: Chinese
Pages: 44
Chinese keywords: 雲端運算, 大規模數據處理, 映射化簡 (MapReduce), Hadoop, 運算資源, 任務調度
English keywords: Cloud Computing, Large-Scale Data Processing, MapReduce, Hadoop, Resources, Task Scheduling
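  • In environments where large-scale data must be processed, cloud computing provides a highly flexible computing infrastructure. The MapReduce software framework proposed by Google is widely used for parallel processing of large-scale data and has become a core technology of cloud computing; Hadoop is an open-source project inspired by the MapReduce architecture. Many organizations now use Hadoop as their cloud computing platform to process large-scale data in parallel, and some also take part in Hadoop research and development. Hadoop is usually discussed in the context of homogeneous nodes, but as the IT industry changes, a Hadoop cluster may be made up of many computers with different performance. In a heterogeneous cloud environment, effective management of system resources is therefore the key to improving resource utilization and MapReduce performance. However, Hadoop's original task scheduling mechanism decides only on the basis of each node's static slot configuration and does not consider the node's computing resources or real-time load, such as the usage of CPU, memory, storage devices, or the network. This thesis builds an actual Hadoop computing environment and proposes a dynamic task allocation scheme that considers the real-time resource load on each node, making effective use of computing resources while preventing their overuse. Experimental results show that, in a heterogeneous cloud environment, the proposed dynamic task allocation scheme makes effective use of computing resources and improves overall computing performance.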


    Cloud computing features a flexible computing infrastructure for large-scale data processing. MapReduce is becoming a leading large-scale data processing model that provides a logical framework for cloud computing, and Hadoop, an open-source implementation of MapReduce, is a common platform for realizing this kind of parallel computing model. Nodes in the current Hadoop environment are normally homogeneous. Efficient resource management in heterogeneous clouds is crucial for improving the performance of MapReduce applications and the utilization of resources. However, the original scheduling scheme in Hadoop assigns tasks to each node based on a fixed, static number of slots, without considering the physical workload of the node's computing resources, such as CPU utilization, memory usage, and network bandwidth on each worker node. This study proposes a dynamic task allocation scheme that considers the physical workload on each node so as to prevent resource underutilization in the cloud computing environment. The evaluation results show that the proposed scheme can improve the overall computation efficiency among the heterogeneous nodes in the cloud.
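The abstract contrasts static slot counts with a decision based on each node's measured load, but it does not state the actual rule used in the thesis. The following is only a minimal sketch of the general idea, under assumptions that are not from the source: the `NodeLoad` fields, the 0.8 threshold, and the "least-loaded bottleneck resource wins" policy are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeLoad:
    """Hypothetical snapshot of one worker node; utilizations are in [0.0, 1.0]."""
    name: str
    cpu: float
    memory: float
    network: float
    running_tasks: int
    max_slots: int  # static slot count, as in the original Hadoop scheduler


def bottleneck(node: NodeLoad) -> float:
    """Utilization of the node's busiest resource."""
    return max(node.cpu, node.memory, node.network)


def can_accept(node: NodeLoad, threshold: float = 0.8) -> bool:
    """A node may take a task only if it has a free slot AND is not overloaded.

    The static check alone (running_tasks < max_slots) mirrors slot-based
    scheduling; the threshold check is the assumed workload-aware addition.
    """
    return node.running_tasks < node.max_slots and bottleneck(node) < threshold


def pick_node(nodes: List[NodeLoad], threshold: float = 0.8) -> Optional[NodeLoad]:
    """Return the eligible node whose busiest resource is least loaded, or None."""
    candidates = [n for n in nodes if can_accept(n, threshold)]
    if not candidates:
        return None
    return min(candidates, key=bottleneck)


nodes = [
    NodeLoad("fast", cpu=0.30, memory=0.40, network=0.10, running_tasks=2, max_slots=4),
    NodeLoad("slow", cpu=0.90, memory=0.50, network=0.20, running_tasks=1, max_slots=2),
    NodeLoad("full", cpu=0.10, memory=0.10, network=0.10, running_tasks=2, max_slots=2),
]
chosen = pick_node(nodes)
print(chosen.name if chosen else "no node available")
```

In this example a purely slot-based scheduler would still hand a task to `slow`, which has a free slot, whereas the load check skips it because its CPU utilization is above the assumed threshold.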

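    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1.1 Preface
        1.2 Research Motivation and Objectives
        1.3 Thesis Organization
    Chapter 2 Background
        2.1 Related Work
        2.2 MapReduce Architecture
        2.3 Introduction to Hadoop
            2.3.1 HDFS
            2.3.2 Hadoop MapReduce
    Chapter 3 System Architecture and Design
        3.1 Hadoop Slot Configuration - SCR
        3.2 System Overload Caused by Non-Hadoop Tasks
        3.3 Dynamic Task Allocation Scheme Based on Node Workload
    Chapter 4 Performance Evaluation
        4.1 System Implementation
            4.1.1 Hadoop Configuration and Installation
            4.1.2 Ganglia Monitoring System
        4.2 Test Environment Setup
        4.3 Performance Evaluation
            4.3.1 Hadoop Slot Configuration - SCR
            4.3.2 Workload-Based Slot Configuration and Dynamic Task Allocation Scheme
            4.3.3 Performance Comparison under Non-Hadoop Load
    Chapter 5 Conclusions and Future Work
    References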

