
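Graduate student: 蘇建安
Jian-An Su
Thesis title: 運用深度學習光流於區塊切割模式繼承以加速H.266/VVC幀內編碼程序
Utilizing Deep-Learned Optical Flow for CU Partition Mode Inheritance to Speed Up H.266/VVC Intra-Frame Coding
Advisor: 陳建中
Jiann-Jone Chen
Oral defense committee: 杭學鳴
Hsueh-Ming Hang
郭天穎
Tien-Ying Kuo
鍾國亮
Kuo-Liang Chung
Degree: Master
System/Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: Chinese
Number of pages: 45
Keywords (Chinese): Versatile Video Coding (多功能影像編碼), intra-frame coding process (幀內編碼程序), multi-scale structural similarity (多層級相似度結構), deep-learned optical flow (深度學習光流)
Keywords (English): H.266/VVC, Intra-Frame Coding, MS-SSIM, Deep-Learned Optical Flow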

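As network technology advances, streaming applications such as remote desktop, live streaming, and cloud gaming are becoming more common. These applications are all interactive, so real-time feedback and image quality control are essential to a good user experience. Building on the latest-generation versatile video coding standard, H.266/VVC, this study proposes an acceleration method for intra-frame coding. We design a block partition mode inheritance algorithm that uses multi-scale structural similarity (MS-SSIM) to measure block similarity and deep-learned optical flow to screen out high-motion blocks. For blocks that change little, the partition mode of the previous frame is inherited by the following frame, which saves the time otherwise spent on exhaustive search. Experimental results show an average inherited-block ratio (Use Rate) of 61.32%. Compared with the default H.266 encoder (VTM 11.0), the method saves 42.34% of encoding time (Time Saving), with a BDBR increase of 1.49%.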


Advances in digital technologies, such as high-speed networking, media compression, and greater computing power, allow multimedia streaming applications such as remote desktop, online streaming, and cloud gaming to provide interactive services. To give users a better viewing experience, a multimedia processing platform has to provide real-time feedback and image quality control. In this research, we study how to perform fast intra-frame coding within the H.266/VVC coding framework. To encode an intra-coded frame quickly, we propose detecting blocks that lie in static background regions and inheriting their block split modes from the previous intra-coded frame, which saves processing time. A block-based MS-SSIM method measures the similarity of co-located blocks in the current and previous frames to roughly identify static-region blocks. In addition, we use a deep-learned optical flow model to quickly estimate pixel-wise motion activity and screen out high-motion blocks adjacent to non-static blocks. With split mode inheritance applied to these static blocks, the coding controller can skip the rate-distortion optimization operations, which require an exhaustive search, and thereby reduce encoding time. Experiments show that the proposed method saves 42.34% of encoding time with a 1.49% BDBR increase compared with VTM 11.0 intra coding under default settings. On average, 61.32% of the blocks in the test video sequences are static-region blocks.
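
The inheritance decision described in the abstract can be illustrated with a minimal sketch. It assumes that a per-block MS-SSIM score and a per-pixel optical flow field have already been computed elsewhere. The names `BlockStats`, `select_inherited_blocks`, `use_rate`, and `time_saving`, as well as both threshold values, are hypothetical and are not taken from the thesis or from VTM. The adjacency check against non-static neighbours is omitted for brevity, and the time-saving formula is the commonly used relative definition, assumed here rather than quoted from the thesis.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Sequence, Tuple


@dataclass
class BlockStats:
    """Measurements for one block, current frame vs. previous frame."""
    ms_ssim: float                        # similarity of co-located blocks, in [0, 1]
    flow: Sequence[Tuple[float, float]]   # per-pixel (dx, dy) motion vectors


def mean_flow_magnitude(flow: Sequence[Tuple[float, float]]) -> float:
    """Average Euclidean length of the per-pixel motion vectors."""
    if not flow:
        return 0.0
    return sum(hypot(dx, dy) for dx, dy in flow) / len(flow)


def select_inherited_blocks(
    blocks: Sequence[BlockStats],
    sim_threshold: float = 0.95,     # hypothetical value, not from the thesis
    motion_threshold: float = 0.5,   # hypothetical value, in pixels
) -> List[bool]:
    """Return one flag per block; True means reuse the previous split mode."""
    flags = []
    for block in blocks:
        looks_static = block.ms_ssim >= sim_threshold
        low_motion = mean_flow_magnitude(block.flow) <= motion_threshold
        # A block inherits only if it passes both the similarity test
        # and the optical-flow motion screen.
        flags.append(looks_static and low_motion)
    return flags


def use_rate(flags: Sequence[bool]) -> float:
    """Percentage of blocks that inherit their split mode."""
    return 100.0 * sum(flags) / len(flags) if flags else 0.0


def time_saving(t_anchor: float, t_proposed: float) -> float:
    """Relative encoding-time reduction in percent (assumed definition)."""
    return 100.0 * (t_anchor - t_proposed) / t_anchor


# Toy input: made-up numbers, purely to exercise the functions.
example_blocks = [
    BlockStats(0.99, [(0.0, 0.0), (0.1, 0.0)]),  # similar, almost no motion
    BlockStats(0.99, [(3.0, 4.0)]),              # similar, but strong motion
    BlockStats(0.60, [(0.0, 0.0)]),              # dissimilar content
    BlockStats(0.97, []),                        # similar, no flow samples
]
example_flags = select_inherited_blocks(example_blocks)
```

Blocks whose flag is `False` would go through the encoder's normal exhaustive partition search, so the sketch only decides where that search can be skipped.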

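Abstract (Chinese)
Abstract (English)
Acknowledgements
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 Problem Description and Research Approach
1.3 Thesis Organization
Chapter 2 Background
2.1 H.266/Versatile Video Coding
2.1.1 Introduction to H.266
2.1.2 H.266 Coding Units
2.1.3 H.266 Block Partition Modes
2.1.4 H.266 Coding Tree
2.1.5 H.266 Complexity Analysis
2.1.6 H.266 Fast Algorithms
2.1.6.1 Syntax Restrictions
2.1.6.2 Redundancy Restrictions
2.2 Convolutional Neural Network
2.2.1 Convolution
2.2.2 CNN (Convolutional Neural Network)
2.3 Optical Flow
2.3.1 Optical Flow
2.3.2 Deep-Learned Optical Flow
2.4 Structural Similarity Index
2.4.1 SSIM
2.4.2 MS-SSIM
Chapter 3 Block Partition Mode Prediction Methods
3.1 Block Partition Mode Methods
3.1.1 Predicting Partition Depth
3.1.2 Predicting Partition Mode
3.2 Related Work
3.2.1 Statistics-Based Prediction
3.2.2 CNN-Based Prediction
3.2.3 RNN-Based Prediction
Chapter 4 Block Partition Mode Inheritance Algorithm
4.1 Inheritance Algorithm
4.2 Technical Details
4.2.1 Data Buffer Space
4.2.2 Similarity Detection
4.2.3 Evaluation of Similarity Detection
4.2.4 Optical-Flow Motion Detection
4.2.5 Evaluation of Optical-Flow Motion Detection
4.2.6 Block Partition Inheritance
4.2.7 Algorithm Flow
Chapter 5 Experimental Results and Discussion
5.1 Experimental Setup
5.1.1 Environment Configuration
5.1.2 Evaluation Metrics
5.2 Experimental Results
5.2.1 Comparison with the Original H.266 Encoder
5.2.2 Discussion of the Experimental Data
5.2.2.1 Limited Speed-Up for Simple Blocks
5.2.2.2 Inherited Partition Modes Are Not Optimal
5.2.3 Comparison with Related Work on Partition Coding
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work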

