Graduate student: | 盧其均 Chi-chun Lu |
---|---|
Thesis title: | 合多視角視訊深度與灰階資訊對應以改進分散式編碼器輔助資訊品質的方法 Integrate Depth and Gray-level Information of Multi-View Video to Enhance Side Information of Distributed Video Coder |
Advisor: | 陳建中 Jiann-jone Chen |
Oral defense committee: | 鍾國亮 Kuo-Liang Chung, 張峰誠 Feng-cheng Chang, 陳永昌 Yung-chang Chen, 張意政 I-cheng Chang |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2011 |
Academic year of graduation: | 99 (ROC calendar) |
Language: | Chinese |
Number of pages: | 85 |
Keywords (translated from Chinese): | Depth information, Distributed video coding, Multi-view video compression |
Keywords (English): | MVC, Distributed video coding, Homography |
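With recent advances in video coding and communication technology, multimedia communication applications are no longer limited to passively receiving video. They emphasize conveying the sense of depth and stereoscopy of natural scenes, reproducing what the human eye sees in a real scene, and 3D video processing has therefore become one of the mainstream directions for next-generation multimedia. However, 3D video coding techniques such as Multi-view Video Coding (MVC) involve far more data and much higher computational complexity than conventional single-view video coding, so practical deployment requires upgraded software and hardware. Distributed Video Coding (DVC) shifts the complex computation to the decoder. Combined with Multi-view Video plus Depth (MVD) bitstreams, it can better exploit depth-image correlation and give the decoder room for 3D image reconstruction, and prior research has accordingly proposed the Multi-view Distributed Video plus Depth Coding (MDVDC) architecture. Based on this architecture, this thesis proposes methods to improve video coding quality: (1) a depth-to-luminance perspective transform mapping algorithm for joint decoding of multi-view distributed video, which exploits temporal, inter-view, and depth correlations to obtain the best side information quality and improve coding performance; (2) verification under different group-of-pictures (GOP) structures that the depth-to-luminance perspective transform mapping algorithm effectively improves image quality when applied to different side information generation methods, which in turn shortens turbo decoding time. With GOP=1, Depth Homo-Fusion outperforms the other methods by 0.3 to 3 dB. With GOP=2, Depth Mapping SIFT-BMP improves on MCTI by 2.3 to 8.35 dB, improves on SIFT-BMP by 0.1 to 0.2 dB on the ballet sequence, and reduces turbo decoding time by up to about 20.05% relative to MCTI, while keeping computational complexity below that of MVME. The experimental results show that the proposed method effectively integrates the correlation between image gray-level and depth information and further improves coding performance.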
With advances in video communication technology, multimedia platforms not only receive and play streaming video but also provide depth and stereo information about the natural scene for better visual perception. 3D technologies therefore play an important role in video innovation, and multi-view video coding (MVC) is one application of 3D video coding. However, the amount of video data and the computation required by a multi-view system are much larger than for single-view video. Distributed video coding (DVC) shifts computation to the decoder. The DVC decoder not only exploits inter-view and intra-view correlations but also uses depth information to enhance the quality of reconstructed images. We propose multi-view distributed video plus depth coding (MDVDC) to improve MDVC codec performance: (1) It applies a depth and color perspective transform mapping algorithm in the joint decoder and exploits temporal, inter-view, and depth correlations to yield better side information (SI) images. (2) The proposed MDVDC algorithm can be applied to different GOP sizes and improves codec performance, which reduces turbo decoding time. In comparisons, Depth Homo-Fusion improves on the other methods by 0.3 to 3 dB when GOP=1. When GOP=2, the proposed Depth Mapping SIFT-BMP improves on MCTI and SIFT-BMP by 2.3 to 8.35 dB and 0.1 to 0.2 dB, respectively, and reduces decoding time by 20.05% compared with MCTI.