Author: Che-Yu Lin (林哲宇)
Thesis title: Improve video coding performance by using multi-layer depth image information (運用多層影像深度資訊於改進視訊編碼效能)
Advisor: Jiann-Jone Chen (陳建中)
Oral defense committee: Yi-Leh Wu (吳怡樂), Tai-Lin Chin (金台齡)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar)
Language: Chinese
Pages: 81
Keywords (Chinese): 多層深度影像編碼, 3D影像編碼, 虛擬視角合成, 3D-HEVC, 深度/彩度影像分割
Keywords (English): three-dimensional video coding, multi-layer depth-image coding, virtual view synthesis, 3D-HEVC, depth/color image segmentation
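  • With recent advances in hardware and video coding technology, video resolution has grown from 480p to the 4K video that even mobile phones can now play back, and the volume of video data has increased substantially, making video compression an important topic. Free-view video (FVV) and 3-D video are also receiving growing attention, and multi-view applications are now common, for example mobile phones that capture 3-D images and head-mounted virtual reality devices. For multi-view video, achieving a good free-view viewing experience with texture information alone places a heavy computational load on software and hardware. To reduce transmission cost and improve the picture quality of synthesized views, multi-view video introduces depth information: the representation carries the texture of the left and right views together with synchronously recorded depth, and exploiting the signal correlation between depth and color information greatly reduces the software and hardware cost of synthesizing intermediate views. Although depth images can be encoded with existing texture coding methods, their characteristics differ from those of texture images, so the compression gain is less pronounced; the compression and coding of depth images has therefore become an important research topic. This thesis exploits the characteristics of depth images and proposes a multi-layer depth image coding method that, combined with chain codes and region segmentation information, reduces the overall encoding computation. Experimental results show that, compared with the system's default method, the proposed method substantially reduces encoding time, and the depth information can further be used to improve the coding quality of the color information. Overall, it improves the coding performance of multi-view video.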
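The abstract mentions chain codes as one ingredient for describing depth boundaries but does not spell out the variant used. As a rough illustration only, the sketch below shows a generic 8-direction Freeman chain code in Python. The direction table `STEPS`, the function names, and the toy `boundary` are assumptions made for this example and are not taken from the thesis.

```python
# Hypothetical illustration: a generic Freeman 8-direction chain code.
# The direction numbering below is an assumption, not the thesis's convention.
# Index -> (dx, dy) step, with y growing downward as in image coordinates:
# 0 = right, then counter-clockwise as seen on screen.
STEPS = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)]


def encode_chain(points):
    """Turn an ordered list of 8-connected (x, y) points into (start, codes)."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # list.index raises ValueError if two points are not 8-neighbours.
        codes.append(STEPS.index((x1 - x0, y1 - y0)))
    return points[0], codes


def decode_chain(start, codes):
    """Rebuild the point list from the start point and the direction codes."""
    points = [start]
    for code in codes:
        dx, dy = STEPS[code]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


# Toy closed boundary: a square traced clockwise on screen from (2, 2).
boundary = [(2, 2), (3, 2), (4, 2), (4, 3), (4, 4),
            (3, 4), (2, 4), (2, 3), (2, 2)]
start, codes = encode_chain(boundary)
print(start, codes)
```

Each boundary step costs one symbol from an 8-letter alphabet instead of a full coordinate pair, and long straight edges produce runs of repeated symbols that a later entropy coder can exploit.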


    With advances in multimedia codec technologies and related devices, the cost of transmitting media data can be reduced while the system still maintains high-quality reconstructed video. In addition to conventional 2-D multimedia applications, Free-View Video (FVV) and 3-D Video have become popular because they provide a better viewing experience, and related applications have been widely developed. FVV and 3-D Video are acquired with multiple cameras that capture the same scene from different viewing angles. To reduce the cost of multi-view data transmission and enhance the quality of reconstructed virtual views, both the texture and the depth information must be handled well, as the latter plays an important role in virtual view synthesis. In this thesis, we propose a new depth map coding scheme to address the inefficiency of applying conventional texture-based video coding tools to depth maps. A depth map comprises sharp edges along object region boundaries, and the depth values within a region are nearly constant; conventional 2-D video codecs, which are designed to compress texture information, are therefore inefficient at encoding depth maps. By exploiting the high correlation between texture and depth, we propose a texture-segmented, region-based depth coding scheme that preserves accurate depth information while saving bit rate. We propose a two-layer depth image coding method that, together with chain codes and region segmentation results, reduces the total FVV coding complexity. Experimental results show that the proposed method substantially shortens the encoding time and that the decoded depth information can be used to improve texture image coding performance, which improves overall FVV encoding performance.
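    The abstract describes a two-layer depth coding method and observes that depth values are nearly constant inside a region, with sharp edges at object boundaries. The Python sketch below is a loose illustration of why a layered scheme suits such data; it is not the thesis's codec. It assumes a 2x nearest-neighbour base layer plus a residual enhancement layer, and every function name and the toy `depth` map are hypothetical.

```python
# Hypothetical illustration only: the 2x nearest-neighbour base layer and all
# names below are assumptions, not the coding scheme proposed in the thesis.

def downsample(depth, factor=2):
    """Base layer: keep every `factor`-th sample in both directions."""
    return [row[::factor] for row in depth[::factor]]


def upsample(base, height, width, factor=2):
    """Predict the full-resolution map by nearest-neighbour replication."""
    return [[base[y // factor][x // factor] for x in range(width)]
            for y in range(height)]


def residual(depth, prediction):
    """Enhancement layer: what the base-layer prediction gets wrong."""
    return [[d - p for d, p in zip(d_row, p_row)]
            for d_row, p_row in zip(depth, prediction)]


def reconstruct(prediction, res):
    """Add the residual back onto the prediction to recover the original."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, res)]


# Toy 6x6 depth map: background at 50, a foreground object at 200 whose
# edges do not line up with the 2x sampling grid.
depth = [[200 if (1 <= y <= 4 and x >= 3) else 50 for x in range(6)]
         for y in range(6)]

prediction = upsample(downsample(depth), 6, 6)
res = residual(depth, prediction)
nonzero = sum(1 for row in res for value in row if value != 0)
print(f"{nonzero} of 36 residual samples are non-zero")
```

    In this toy case the residual is zero everywhere except next to the object boundary, which is the property a layered depth coder can exploit: the low-resolution layer carries the flat regions, and only the samples around edges need extra bits.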

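    Abstract (Chinese)
    Abstract
    Acknowledgments
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Motivation and Objectives
      1.2 Problem Description and Research Method
      1.3 Thesis Organization
    Chapter 2 Background and Related Work
      2.1 Stereoscopic Vision and Free-View Video
        2.1.1 Stereoscopic Video: Principles and Architecture
        2.1.2 Free-View Video: Principles and Architecture
        2.1.3 Virtual View Synthesis
      2.2 Image Processing Methods and Fundamentals
        2.2.1 Introduction to Color Spaces and Conversion Principles
        2.2.2 Image Region Segmentation
        2.2.3 Binary-Tree Image Region Partitioning
        2.2.4 Introduction to Spatial Scalability
      2.3 Existing Depth Coding Architectures
        2.3.1 Block-Based Coding: 3-D HEVC Extension Depth Coding
        2.3.2 Region-Based Coding
        2.3.3 Depth Coding Assisted by Color Boundaries
    Chapter 3 Improving Video Coding Performance Using Multi-Layer Depth Image Information
      3.1 System Architecture
      3.2 Spatial Scalability and Residual Coding
      3.3 Coding of Depth Discontinuity Boundaries
        3.3.1 Definition and Labeling of Depth Discontinuities
        3.3.2 Analysis and Data Compression of Discontinuity Boundary Information
      3.4 Texture Region Partitioning with Depth Boundaries
        3.4.1 Problems in Applying Texture-Only Region Partitioning to Depth Coding
        3.4.2 Image Region Model Incorporating Depth Discontinuity Boundaries
      3.5 Merging of Regions with Similar Depth
      3.6 Coding of Depth Region Values
    Chapter 4 Experimental Results and Data Analysis
      4.1 Experimental Environment
      4.2 Performance Comparison of Depth Image Coding
        4.2.1 Compression Efficiency
        4.2.2 Depth Image Reconstruction Results
        4.2.3 Encoding Time Comparison
      4.3 Virtual View Synthesis
    Chapter 5 Conclusion
      5.1 Conclusion
      5.2 Future Work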

