
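Graduate student: 吳逸韡
YI WEI WU
Thesis title: 運用深度學習法於加速 H.266 幀內編碼
Utilizing Deep Learning Methods to Speed Up H.266 Intra-Frame Coding
Advisor: 陳建中
Jiann-Jone Chen
Oral defense committee: 郭天穎
TIAN YING GUO
鍾國亮
Kuo-Liang Chung
花凱龍
KAI LONG HUA
吳怡樂
Yi-Leh Wu
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2020
Academic year of graduation: 108
Language: Chinese
Number of pages: 64
Keywords (translated from Chinese): intra-frame coding, deep learning, video coding
Keywords (English): H.266, VTM, VVC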
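  • The 5G communication era brings high-speed, low-latency, high-quality transmission. In addition, as computing speed has increased, video coding standards have advanced from the 4K picture quality of High Efficiency Video Coding (HEVC/H.265) to the 8K picture quality of the Versatile Video Coding standard (VVC/H.266). Under the VVC coding unit (CU) block coding structure, in addition to the Quad Tree (QT) partition mode of HEVC/H.265, H.266/JVET added Binary Tree (BT) partitioning, i.e., the QTBT mode, and H.266/VVC further added the Ternary Tree (TT). Compared with HEVC, QTBT requires 523% of the computation in the All Intra configuration, but VVC coding needs only half the bitrate of HEVC. To achieve this coding performance, VVC uses many new coding techniques and replaces the original quadtree partitioning with a QT with nested multi-type tree (QTMT) to support more flexible CU partitioning. A coding tree unit (CTU) is first partitioned by QT, and the QT leaf nodes can then be further partitioned by the multi-type tree (MT) structure. The MT structure has four partition types: vertical BT (BV), horizontal BT (BH), vertical TT (TV), and horizontal TT (TH), which also adds substantial computational complexity. During encoding, H.266/VVC spends a great deal of time exhaustively searching all partition modes for the CU partition mode that is optimal under rate-distortion optimization (RDO). To address this high computational cost, this thesis proposes using Convolutional Neural Networks (CNN) to predict intra-picture CU partition modes, thereby reducing the complexity of intra coding in H.266/VVC. We use the CNN model ResNet to predict the partition modes of 32×32 and 16×16 blocks to reduce encoding time. (1) First, a coding block dataset is built: from H.266/VVC1.1.0 standard bitstreams we extract the CU modes produced by the QTMT procedure for 32×32 blocks, and use the original block pixel values and the CU partition modes as input samples and labels to train a ResNet; (2) a second dataset is built by extracting from the bitstreams the modes produced by QTMT for 16×16 blocks, together with the original block data, as ResNet input; (3) the predictions of the two ResNets are combined into a fast CU decision mode for H.266/VVC. Experimental results show that the proposed method reduces encoding time by 73.49% with a BDBR increase of 2.74%.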
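    The four MT partition types and the QT split named in the abstract can be illustrated by the sub-block sizes each one produces. The following is a minimal sketch, not code from the thesis or from the VVC reference software: the function name `split_block` and the extra `"NS"` (no split) label are hypothetical, and the 1:2:1 ternary ratio is the standard VVC ternary split.

```python
# Illustrative only: sub-block sizes produced by each QTMT split mode.
# `split_block` and the "NS" label are hypothetical names for this sketch.

SPLIT_MODES = ("NS", "QT", "BV", "BH", "TV", "TH")


def split_block(width, height, mode):
    """Return the (width, height) of each sub-block for one split mode."""
    if mode == "NS":  # no split: the block is coded as a single CU
        return [(width, height)]
    if mode == "QT":  # quad tree: four equal quadrants
        return [(width // 2, height // 2)] * 4
    if mode == "BV":  # vertical binary split: left and right halves
        return [(width // 2, height)] * 2
    if mode == "BH":  # horizontal binary split: top and bottom halves
        return [(width, height // 2)] * 2
    if mode == "TV":  # vertical ternary split: widths in a 1:2:1 ratio
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    if mode == "TH":  # horizontal ternary split: heights in a 1:2:1 ratio
        return [(width, height // 4), (width, height // 2), (width, height // 4)]
    raise ValueError(f"unknown split mode: {mode}")


if __name__ == "__main__":
    for mode in SPLIT_MODES:
        print(mode, split_block(32, 32, mode))
```

    Every mode tiles the parent block exactly, so an exhaustive RDO search has to evaluate each of these layouts recursively, which is the cost the thesis aims to reduce.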


    Fifth-generation (5G) communication enables high-speed, low-latency data transmission. In addition, as computer processing speed has advanced, video coding standards have progressed from the 4K image quality of High Efficiency Video Coding (HEVC/H.265) to the Versatile Video Coding standard (VVC/H.266). Under the H.266 coding unit (CU) block coding framework, in addition to the Quad Tree (QT) partition mode of HEVC, H.266/JVET allows Binary Tree (BT) partitioning, i.e., the QTBT mode, and H.266/VVC allows Ternary Tree (TT) block partitioning. The time complexity of intra-coded QTBT is 523% of that of HEVC, but the QTBT can reduce the required bitrate by 50%. To achieve high video coding efficiency, VVC adopts a QT with nested Multi-Type Tree (QTMT) partition procedure to encode a CU. It first performs QT partitioning on a coding tree unit (CTU) and then applies the MT partition procedure to each sub-block. During coding, the MT procedure must decide whether to partition the CU with vertical BT (BV), horizontal BT (BH), vertical TT (TV), or horizontal TT (TH) through an exhaustive Rate-Distortion Optimization (RDO) procedure, which leads to high time complexity. To reduce the time complexity of VVC encoding, we propose using a Convolutional Neural Network (CNN) to predict the CU coding mode and so avoid the time-consuming QTMT procedure. We adopt a CNN model, ResNet, and train it on raw block data and block coding modes to build a model that predicts the CU coding mode; 32×32 and 16×16 blocks were selected to train the ResNet models.
(1) The pixel values and coding modes of 32×32 blocks are extracted from video bitstreams encoded by H.266/VVC1.1.0 to train the first ResNet; (2) the pixel values and coding modes of 16×16 blocks are extracted to train the second ResNet; (3) by combining the first and second ResNets, the coding controller can predict the H.266/VVC CU coding mode from block pixel values, so that it can terminate the coding procedure early or bypass unlikely RDO processes. Experiments showed that the proposed method reduces encoding time by 73.79% while the BDBR increase is less than 2.74%.

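    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research motivation and objectives
      1.2 Problem description and approach
      1.3 Thesis organization
    Chapter 2 Background
      2.1 Introduction to the H.266/VVC video coding standard
        2.1.1 Standardization and historical background of H.266/VVC
        2.1.2 Differences between H.266/VVC and H.265/HEVC
        2.1.3 CU coding structure of H.266/VVC
          2.1.3.1 Coding unit (CU)
          2.1.3.2 Partitioning mechanism of the multi-type tree structure
          2.1.3.3 Architecture of the multi-type tree structure
          2.1.3.4 Picture boundaries and CU partitioning
          2.1.3.5 Redundancy caused by CU partitioning
        2.1.4 Intra prediction
          2.1.4.1 Intra mode coding with 67 intra prediction modes
            MPM construction and intra mode coding
            Wide-angle intra prediction (WAIP) for non-square blocks
            Mode-dependent intra smoothing (MDIS)
          2.1.4.2 Cross-component linear model prediction (CCLM)
          2.1.4.3 Position-dependent intra prediction combination (PDPC)
          2.1.4.4 Multiple reference line (MRL) intra prediction
        2.1.5 Inter prediction
          2.1.5.1 Extended merge prediction
            Spatial candidates
            Temporal candidates
            History-based merge candidates
            Average merge candidates
          2.1.5.2 MMVD (Merge mode with MVD)
          2.1.5.3 Affine motion-compensated prediction
      2.2 Introduction to the ResNet convolutional network
      2.3 Machine learning workflow
    Chapter 3 Fast algorithms for H.266/VVC coding units
      3.1 H.266/VVC complexity analysis
      3.2 Related work on fast CU partitioning algorithms for H.266/FVC
      3.3 Related work on fast CU partitioning algorithms for H.265/HEVC
      3.4 Fast decision method for H.266/VVC using convolutional neural networks
        3.4.1 Building the dataset for the deep learning CNN
        3.4.2 Applying the classic deep learning CNN ResNet
    Chapter 4 Experimental results and discussion
      4.1 Experimental setup
      4.2 Comparison of the ResNet convolutional neural network with the original VVC
      4.3 Comparison of the proposed method with the HEVC-CNN references [9] and [22]
      4.4 Comparison of the proposed method with the H.266/FVC reference [6]
    Chapter 5 Conclusions and future work
      5.1 Conclusions
      5.2 Future work
    References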

    [1] J. An, H. Huang, K. Zhang. Quadtree plus binary tree structure integration with JEM tools, JVET-B0023, Joint Video Exploration Team (JVET). Feb. 2016.
    [2] R. H. Gweon, Y.-L. Lee, and J. Lim. Early termination of CU encoding to reduce HEVC complexity, JCTVC-F045, ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC). Jul. 2011.
    [3] J. Kim, S. Jeong, S. Cho, and J. S. Choi, “Adaptive coding unit early termination algorithm for HEVC,” IEEE International Conference on Consumer Electronics (ICCE), 2012.
    [4] X. Shen and L. Yu, “CU splitting early termination based on weighted SVM,” EURASIP Journal on Image and Video Processing, vol. 1, pp. 1, 2013.
    [5] Y. Zhang, S. Kwong, X. Wang, H. Yuan, Z. Pan, L. Xu, “Machine Learning-Based Coding Unit Depth Decisions for Flexible Complexity Allocation in High Efficiency Video Coding,” IEEE Trans. Image Processing, vol. 24, pp. 2225-2238, 2015.
    [6] Z. Wang et al., “Effective quadtree plus binary tree block partition decision for future video coding,” Data Compression Conference (DCC), Snowbird, Utah, Apr. 4-7, 2017.
    [7] G. Corrêa et al., “Fast HEVC encoding decisions using data mining,” IEEE Trans. Circuits Syst. Video Technol., vol. 25, no. 4, pp. 660–673, Apr. 2015.
    [8] Y. Zhang et al., “Machine learning-based coding unit depth decisions for flexible complexity allocation in high efficiency video coding,” IEEE Trans. Image Process., vol. 24, no. 7, pp. 2225–2238, Jul. 2015.
    [9] Z. Liu et al., “CU partition mode decision for HEVC hardwired intra encoder using convolution neural network,” IEEE Trans. Image Process., vol. 25, no. 11, pp. 5088–5103, Nov. 2016.
    [10] T. Mallikarachchi et al., “Content-adaptive feature-based CU size prediction for fast low-delay video encoding in HEVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 28, no. 3, pp. 693–705, Mar. 2018.
    [11] L. Zhu, Y. Zhang, Z. Pan, R. Wang, S. Kwong, and Z. Peng, “Binary and multi-class learning based low complexity optimization for HEVC encoding,” IEEE Trans. Broadcast., vol. 63, no. 3, pp. 547–561, Sep. 2017.
    [12] Q. Hu et al., “Fast HEVC intra mode decision based on logistic regression classification,” in Proc. IEEE Int. Symp. Broadband Multimedia Syst. Broadcast (BMSB), pp. 1-4, June 2016.
    [13] D. Liu, X. Liu, and Y. Li, “Fast CU size decisions for HEVC intra frame coding based on support vector machines,” in Proc. IEEE 14th Intl Conf. Dependable, Auto. Secure Comput. (DASC), pp. 594–597, Aug. 2016.
    [14] T. Laude and J. Ostermann, “Deep learning-based intra prediction mode decision for HEVC,” in Proc. Picture Coding Symp. (PCS), pp. 1-5, Dec. 2016.
    [15] H.-S. Kim and R.-H. Park, “Fast CU partitioning algorithm for HEVC using an online-learning-based Bayesian decision rule,” IEEE Trans. Circuits Syst. Video Technol., vol. 26, no. 1, pp. 130-138, Jan. 2016.
    [16] H. R. Tohidypour et al., “Online-learning-based mode prediction method for quality scalable extension of the high efficiency video coding (HEVC) standard,” IEEE Trans. Circuits Syst. Video Technol., vol. 27, no. 10, pp. 2204–2215, Oct. 2017.
    [17] J. F. de Oliveira and M. S. Alencar, “Online learning early skip decision method for the HEVC inter process using the SVM-based pegasos algorithm,” Electron. Lett., vol. 52, no. 14, pp. 1227–1229, Jul. 2016.
    [18] F. Duanmu, Z. Ma, and Y. Wang, “Fast mode and partition decision using machine learning for intra-frame coding in HEVC screen content coding extension,” IEEE J. Emerg. Sel. Topics Circuits Syst., vol. 6, no. 4, pp. 517–531, Dec. 2016.
    [19] S. Momcilovic, N. Roma, L. Sousa, and I. Milentijevic, “Run-time machine learning for HEVC/H.265 fast partitioning decision,” in Proc. IEEE Int. Symp. Multimedia (ISM), pp. 347–350, Dec. 2015.
    [20] B. Du, W.-C. Siu, and X. Yang, “Fast CU partition strategy for HEVC intra-frame coding using learning approach via random forests,” in Proc. Asia–Pacific Signal Inf. Process. Assoc. Annu. Summit Conf. (APSIPA), pp. 1085-1090, Dec. 2015.
    [21] Y. Shan and E.-H. Yang, “Fast HEVC intra coding algorithm based on machine learning and Laplacian transparent composite model,” in Proc. IEEE Int. Conf. Acoust., Speech Signal Process. (ICASSP), pp. 2642–2646, March 2017.
    [22] M. Xu et al., “Reducing complexity of HEVC: A deep learning approach,” IEEE Trans. Image Process., vol. 27, no. 10, pp. 5044–5059, Oct. 2018.
    [23] https://stdshare.itri.org.tw/Content/Files/Event/Files/4.%20FVC%E8%A6%96%E8%A8%8A%E8%A6%8F%E6%A0%BC%E6%A8%99%E6%BA%96%E5%8C%96%E9%80%B2%E7%A8%8B(%E6%9E%97%E4%BF%8A%E9%9A%86).pdf
    [24] https://kknews.cc/zh-tw/news/nr6l2vg.html.
    [25] Yang, H., Shen, L., Dong, X., Ding, Q., An, P., Jiang, G. “Low complexity CTU partition structure
