
Graduate student: 宋立贏
Li-Ying Sung
Thesis title: 基於標籤與動態資源分配之雲端視訊轉碼排程
A label-based dynamic cloud resource allocation method for video transcoding
Advisor: 陳建中
Jiann-Jone Chen
Oral defense committee: 陳建中
Jiann-Jone Chen
謝君偉
Jun-Wei Hsieh
蔡耀弘
Yao-Hong Tsai
吳怡樂
Yi-Leh Wu
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2017
Academic year of graduation: 106 (ROC calendar)
Language: Chinese
Number of pages: 82
Chinese keywords: Hadoop Yarn, MapReduce 工作, 標籤化排程, 雲端視訊轉碼, 類神經網路, FFmpeg
English keywords: Hadoop Yarn, MapReduce job, Label-Based Scheduling, Cloud Video Transcoding, Neural Network, FFmpeg
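  • Cloud multimedia technology has in recent years been widely applied across devices and social networks. Because user devices and network conditions vary, a cloud platform must use transcoding to convert video into a format and quality suited to each user's network and device. Video processing is computationally complex and consumes substantial CPU resources. Common remedies are scalable video coding, or cloud-based video transcoding to shorten processing time and lighten the CPU load. To address heterogeneous networks, MPEG proposed the MPEG-DASH (Dynamic Adaptive Streaming over HTTP) framework, which serves media content in different file formats, resolutions, and bit rates according to the user's environment, easing playback across different devices. To improve transcoding efficiency, this thesis uses a cloud cluster computing platform and performs transcoding as distributed computation (MapReduce) on the Yarn framework. The Hadoop Distributed File System (HDFS) stores and manages the video segments. The Yarn framework is used to improve dynamic task scheduling in the cloud. A Label-Based and Complexity Aware Scheduler NN algorithm then assigns transcoding jobs to the corresponding queues according to the memory state of the compute nodes, schedules different users independently, and adjusts the number of containers within the resources so that the system maintains the best load balance, giving the cluster as a whole better resource utilization. The CAS (Complexity Aware Scheduler) algorithm schedules higher-complexity tasks first, preventing them from concentrating on a single worker node. FFmpeg is used to extract the features of each video, and a neural network then predicts the video transcoding time. An exponential smoothing speculation method is also incorporated to reduce the overall job completion time. Experimental results show that the proposed method raises average CPU utilization above 90%, keeps resource utilization above 98%, and shortens transcoding time by about 50%.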
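
    As a rough illustration of the exponential smoothing step mentioned in the abstract, the sketch below applies simple exponential smoothing to a series of observed task durations. The abstract does not give the formula, the smoothing factor, or the quantity being smoothed, so the function name `exponential_smoothing`, the parameter `alpha`, and the use of task durations in seconds are all assumptions made for illustration, not the thesis implementation.

```python
from typing import Sequence


def exponential_smoothing(observations: Sequence[float], alpha: float = 0.5) -> float:
    """Return a smoothed estimate from a series of observations.

    Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_{t-1},
    seeded with the first observation. `alpha` is a hypothetical smoothing
    factor; the thesis abstract does not state the value it uses.
    """
    if not observations:
        raise ValueError("at least one observation is required")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    estimate = observations[0]
    for value in observations[1:]:
        # Newer observations are weighted by alpha, history by (1 - alpha).
        estimate = alpha * value + (1.0 - alpha) * estimate
    return estimate


if __name__ == "__main__":
    # Hypothetical durations (seconds) of recently finished transcoding tasks.
    durations = [10.0, 20.0, 30.0]
    print(exponential_smoothing(durations, alpha=0.5))
```

    A larger `alpha` makes the estimate follow recent durations more closely, while a smaller one smooths out short-lived spikes. Which trade-off suits a transcoding cluster depends on how quickly node load changes.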


    Cloud multimedia applications have become widespread in recent years. A multimedia cloud has to serve users under different network environments, so it has to perform scalable video coding or video transcoding to provide bandwidth-compatible video bitstreams. The MPEG-DASH (Dynamic Adaptive Streaming over HTTP) framework was developed to serve multimedia to users with heterogeneous networks and devices. Because video processing and transcoding are computationally expensive, the cloud platform has to schedule tasks efficiently to respond promptly to user requests. In this research, we design video transcoding methods for cloud computer clusters based on the Yarn concurrent MapReduce processing framework. The Hadoop Distributed File System (HDFS) is used to store and manage video segments. The Hadoop Yarn framework is adopted for cloud transcoding operations. We propose a Label-Based and Complexity-Aware Scheduler using a neural network model, with two control targets: (1) improving the resource utilization rate; and (2) maintaining high load balance. For the first target, the Complexity-Aware Scheduler assigns tasks in ascending order of task complexity, which effectively reduces the convoy effect and improves the resource utilization rate. Estimating task complexity with neural network models to reorder task assignment improves the resource utilization rate further. For the second target, the system monitors worker CPU utilization through a heartbeat mechanism and adaptively adjusts the number of worker slots (containers) to maintain high load balance. Both control targets help improve video transcoding performance and shorten processing time. Experiments showed that the proposed scheduling method maintains the CPU and resource utilization rates above 90% and 98%, respectively, and reduces the total processing time by about 50%.
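
    The two control targets in the English abstract can be sketched roughly as follows: tasks are ordered by ascending estimated complexity, and the number of worker slots on a node is nudged up or down from the CPU utilization reported in a heartbeat. This is a minimal sketch under stated assumptions, not the thesis implementation. The `Task` class, the `predicted_seconds` field (standing in for the neural network output), the function names, the utilization thresholds, and the slot limits are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    # Hypothetical stand-in for the transcoding time predicted by the
    # neural network model described in the abstract.
    predicted_seconds: float


def order_by_complexity(tasks: List[Task]) -> List[Task]:
    """Return tasks in ascending order of estimated complexity."""
    return sorted(tasks, key=lambda task: task.predicted_seconds)


def adjust_slots(
    current: int,
    cpu_utilization: float,
    low: float = 0.70,   # assumed threshold: node is under-used below this
    high: float = 0.95,  # assumed threshold: node is saturated above this
    minimum: int = 1,
    maximum: int = 8,
) -> int:
    """Return the new slot count for one node after one heartbeat report."""
    if cpu_utilization < low:
        # Spare CPU capacity: allow one more concurrent task.
        return min(current + 1, maximum)
    if cpu_utilization > high:
        # Overloaded node: take one slot away.
        return max(current - 1, minimum)
    return current


if __name__ == "__main__":
    queue = [Task("clip_a", 42.0), Task("clip_b", 7.5), Task("clip_c", 19.0)]
    print([task.name for task in order_by_complexity(queue)])
    print(adjust_slots(current=4, cpu_utilization=0.50))
```

    Changing the slot count by one per heartbeat keeps the adjustment gradual. A real scheduler would also need to account for memory limits and node labels, which this sketch leaves out.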

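    Chinese Abstract
    Abstract
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Overview of Research Methods
      1.3 Thesis Organization
    Chapter 2 Background and Related Work
      2.1 Background on Media Bitstream Coding and Compression
        2.1.1 Introduction to Video Transcoding
        2.1.2 Introduction to FFmpeg
      2.2 Technologies for Cloud Computing Content and Services
        2.2.1 Basic Definition of Cloud Computing
        2.2.2 Cloud Parallel Processing Technologies
        2.2.3 First-Generation Hadoop
        2.2.4 Hadoop Yarn
        2.2.5 Comparison of Hadoop Yarn and Hadoop MapReduce
        2.2.6 Internal Structure of the Yarn Framework
      2.3 Background on Scheduling Algorithms
        2.3.1 Background
        2.3.2 Scheduling Algorithm Fundamentals
        2.3.3 Yarn Resource Schedulers
        2.3.4 First In First Out (FIFO)
        2.3.5 Fair Scheduler (FS)
        2.3.6 Capacity Scheduler (CS)
      2.4 Introduction to Machine Learning
        2.4.1 Machine Learning Workflow
        2.4.2 Categories of Machine Learning
        2.4.3 Neural Networks
    Chapter 3 System Architecture Design
      3.1 System Architecture and Functional Description
      3.2 Analysis of System Operation
      3.3 Performance Evaluation of Multimedia File Segmentation
      3.4 Task Speculation
        3.4.1 Background
        3.4.2 Exponential Smoothing Speculation Method
      3.5 Scheduling Algorithm
        3.5.1 Resource Management with Yarn Node Labels
        3.5.2 Ranking Video Complexity with a Neural Network
        3.5.3 Yarn Task Selection Strategy
    Chapter 4 Experimental Results and System Demonstration
      4.1 Single-Machine Transcoding Time
      4.2 Neural Network Training
      4.3 Comparison of Experimental Data
        4.3.1 Comparison of Algorithms
        4.3.2 Comparison of Segment Sizes
        4.3.3 Comparison of Resource Utilization
        4.3.4 CPU Utilization and Comparison
        4.3.5 Container Allocation by Node Capacity
        4.3.6 Label Assignment by VM Specification
    Chapter 5 Conclusions and Future Research
      5.1 Conclusions
      5.2 Future Work
    References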

