Graduate student: | Che-Yu Lin (林哲宇) |
---|---|
Thesis title: | Improve video coding performance by using multi-layer depth image information (運用多層影像深度資訊於改進視訊編碼效能) |
Advisor: | Jiann-Jone Chen (陳建中) |
Oral defense committee: | Yi-Leh Wu (吳怡樂), Tai-Lin Chin (金台齡) |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2016 |
Academic year of graduation: | 104 |
Language: | Chinese |
Number of pages: | 81 |
Chinese keywords: | 多層深度影像編碼 、3D影像編碼 、虛擬視角合成 、3D-HEVC 、深度/彩度影像分割 |
English keywords: | three-dimensional video coding, multi-layer depth-image coding, virtual view synthesis, 3D-HEVC, depth/color image segmentation |
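With recent advances in hardware and video coding technology, video resolution has grown from 480p to 4K, which even mobile phones can now play back. The volume of video data has increased substantially, making video compression an important problem. Free-View Video (FVV) and 3-D Video are also receiving growing attention, and multi-view applications are now common, for example mobile phones that can capture 3D images and virtual reality head-mounted devices. For multi-view video, achieving a good free-viewpoint viewing experience from texture information alone imposes a heavy computational load on both software and hardware. To reduce transmission cost and improve the quality of synthesized views, multi-view video incorporates depth information: the format carries the texture of the left and right views together with synchronously recorded depth, and exploiting the signal correlation between depth and color greatly reduces the software and hardware cost of synthesizing intermediate views. Although existing texture coding methods can be applied to depth images, the characteristics of depth images differ from those of texture images, so the compression gain is less pronounced; depth image compression and coding have therefore become an important research topic. This thesis exploits the characteristics of depth images and proposes a multi-layer depth image coding method, combined with chain codes and region segmentation information, to reduce the overall coding computation load. Experimental results show that, compared with the system default method, the proposed method substantially reduces encoding time, and that the depth information can further be used to improve the coding quality of the color information. Overall, the method improves the coding performance of multi-view video.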
With advances in multimedia codec technologies and related application devices, the cost of transmitting media data can be reduced while the system still maintains high-quality reconstructed video. In addition to conventional 2-D multimedia applications, Free-View Video (FVV) and 3-D Video have become popular because they provide a better viewing experience, and related applications have been widely developed and deployed to attract more users. FVV and 3-D Video are acquired by capturing the same scene with multiple cameras at different view angles. To reduce the cost of multi-view data transmission and enhance the quality of the reconstructed virtual views, both the texture and the depth information have to be handled well, as the latter plays an important role in virtual view synthesis. In this thesis, we propose a new depth map coding scheme to address the inefficiency of applying conventional texture-based video coding tools to depth maps. A depth map comprises sharp edges along object region boundaries, and the depth values within a region are nearly constant, so conventional 2-D video codecs, which are designed to compress texture information, encode depth maps inefficiently. By exploiting the high correlation between texture and depth, we propose a texture-segmented, region-based depth coding scheme that preserves accurate depth information while reducing the required bit rate. We propose a two-layer depth image coding method that, together with chain codes and region segmentation results, reduces the total FVV coding complexity. Experimental results show that the proposed method substantially shortens the encoding time and that the decoded depth information can be used to improve texture image coding performance, which in turn improves the overall FVV encoding performance.
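The abstract mentions chain codes as one ingredient for describing region boundaries in a depth map. As a rough illustration only, the following Python sketch shows a generic 8-direction Freeman chain code, which stores a boundary as a start pixel plus one direction symbol per step. The function names `encode_chain` and `decode_chain`, the direction numbering, and the sample boundary are assumptions made for this example; the record does not describe the thesis's actual chain-code variant or how it is combined with the layered depth coding.

```python
# Freeman 8-direction chain code (illustrative sketch, not the thesis's scheme).
# Each symbol is one step (dx, dy) to a neighbouring pixel, with y growing downward.
# Numbering assumed here: 0 = east, then counter-clockwise on screen.
DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)]


def encode_chain(points):
    """Turn an ordered list of boundary pixels into (start, list of codes)."""
    if not points:
        raise ValueError("boundary must contain at least one point")
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in DIRECTIONS:
            raise ValueError(f"points {(x0, y0)} and {(x1, y1)} are not 8-neighbours")
        codes.append(DIRECTIONS.index(step))
    return points[0], codes


def decode_chain(start, codes):
    """Rebuild the ordered boundary pixels from a start point and chain codes."""
    x, y = start
    points = [(x, y)]
    for code in codes:
        dx, dy = DIRECTIONS[code]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points


# Hypothetical closed boundary of a 3x3 pixel region, traced clockwise on screen.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (0, 0)]
start, codes = encode_chain(square)
print(start, codes)
```

Each symbol takes one of eight values, so a boundary of N steps needs at most 3N bits before any entropy coding, plus the start coordinate. Because depth values inside a region are nearly constant, a boundary description of this kind together with a small number of depth values per region is one plausible way to represent a depth map compactly.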