
Graduate Student: Chi-chun Lu (盧其均)
Thesis Title: Integrate Depth and Gray-level Information of Multi-View Video to Enhance Side Information of Distributed Video Coder
Advisor: Jiann-jone Chen (陳建中)
Committee Members: Kuo-Liang Chung (鍾國亮), Feng-cheng Chang (張峰誠), Yung-chang Chen (陳永昌), I-cheng Chang (張意政)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2011
Graduation Academic Year: 99 (2010–2011)
Language: Chinese
Number of Pages: 85
Chinese Keywords: depth information, distributed video coding, multi-view video compression
English Keywords: MVC, distributed video coding, homography
Recently, with advances in video coding and communication technology, multimedia communication no longer merely receives video passively; it emphasizes conveying the sense of depth and stereoscopic perception of natural scenes, so that viewers see something close to the real scene. 3D video processing has therefore become one of the mainstream directions for next-generation multimedia. However, compared with conventional single-view video coding, 3D video coding techniques such as Multi-view Video Coding (MVC) involve far larger data volumes and very high computational complexity, so practical applications require hardware and software upgrades. Distributed Video Coding (DVC) shifts the heavy computation to the decoder, and combining it with a Multi-view Video plus Depth (MVD) bitstream makes better use of depth-image correlations while giving the decoder room for 3D image reconstruction; a Multi-view Distributed Video plus Depth Coding (MDVDC) architecture has been proposed on this basis. Based on this architecture, this thesis proposes methods to improve coding quality: (1) a depth-to-luminance perspective-transform mapping algorithm for joint multi-view distributed decoding, which exploits temporal, inter-view, and depth correlations to obtain the best side-information quality and improve coding efficiency; (2) verification, over different group-of-pictures (GOP) structures, that the depth-to-luminance perspective-transform algorithm improves image quality for every side-information generation method it is applied to, and thereby shortens turbo decoding time. For GOP = 1, Depth Homo-Fusion outperforms the other methods by 0.3–3 dB. For GOP = 2, Depth Mapping SIFT-BMP improves on MCTI by at least 2.3–8.35 dB, improves on SIFT-BMP by 0.1–0.2 dB on the ballet sequence, and reduces turbo decoding time by up to about 20.05% relative to MCTI, while keeping computational complexity below that of MVME. The experimental results show that the proposed methods effectively integrate the correlation between gray-level and depth information and further improve coding performance.
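As a rough illustration of the inter-view perspective-transform (homography) mapping used to build side information, the sketch below estimates a homography between two camera views from SIFT feature matches and warps one view toward the other as a side-information candidate. This is a minimal sketch using OpenCV and NumPy; the function names (estimate_homography, warp_to_view) and the threshold values are illustrative choices, and the thesis' depth-to-luminance mapping, hole-filling, and contour-correction steps are not reproduced here.

```python
import cv2
import numpy as np

def estimate_homography(img_src, img_dst):
    """Estimate a perspective transform from img_src to img_dst using SIFT
    matches filtered by RANSAC (illustrative sketch; inputs are 8-bit images)."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img_src, None)
    kp2, des2 = sift.detectAndCompute(img_dst, None)

    # Ratio-test matching of SIFT descriptors.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des1, des2, k=2)
    good = [m[0] for m in matches
            if len(m) == 2 and m[0].distance < 0.7 * m[1].distance]

    src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)

    # RANSAC rejects outlier correspondences before fitting (needs >= 4 matches).
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H

def warp_to_view(img_src, H, shape):
    """Warp a source-view image into the target view as a side-information candidate."""
    h, w = shape[:2]
    return cv2.warpPerspective(img_src, H, (w, h))
```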


With the advance of video communication technology, multimedia platforms not only receive and play video streams but also provide depth and stereo information of the natural scene for better visual perception. 3D technologies therefore play an important role in video innovation, and multi-view video coding (MVC) is one application of 3D video coding. However, the amount of video data and the computation required for a multi-view system are very large compared with single-view video. Distributed video coding (DVC) efficiently shifts computation to the decoder, which can exploit inter-view and intra-view correlations and also utilize depth information to enhance the quality of reconstructed images. We adopt multi-view distributed video plus depth coding (MDVDC) and propose methods to improve its codec performance: (1) a depth and color perspective-transform mapping algorithm applied at the joint decoder, which exploits temporal, inter-view, and depth correlations to yield better side information (SI) images; (2) the proposed algorithm can be applied to different GOP structures, improves codec performance, and saves turbo decoding time. In the comparisons, Depth Homo-Fusion improves on the other methods by 0.3–3 dB when GOP = 1. When GOP = 2, the proposed Depth Mapping SIFT-BMP improves on MCTI by 2.3–8.35 dB and on SIFT-BMP by 0.1–0.2 dB, and it saves up to 20.05% of the turbo decoding time compared with MCTI.
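To make the multi-candidate fusion idea concrete, here is a minimal sketch, under assumptions of my own, of fusing a temporal side-information estimate (e.g., an MCTI result) with an inter-view warped estimate block by block, keeping whichever candidate agrees better with the neighbouring decoded key frames. The function name fuse_side_information and the block size are illustrative; this is a generic DVC fusion heuristic, not the thesis' exact Depth Homo-Fusion procedure.

```python
import numpy as np

def fuse_side_information(si_temporal, si_interview, key_prev, key_next, block=8):
    """Block-wise fusion of two SI candidates for a grayscale WZ frame.
    Each block keeps the candidate closer (in MSE) to the average of the
    two neighbouring decoded key frames (illustrative heuristic only)."""
    h, w = si_temporal.shape
    fused = np.empty_like(si_temporal)
    for y in range(0, h, block):
        for x in range(0, w, block):
            sl = (slice(y, min(y + block, h)), slice(x, min(x + block, w)))
            # Reference block: average of the co-located key-frame blocks.
            ref = 0.5 * (key_prev[sl].astype(np.float64) + key_next[sl].astype(np.float64))
            err_t = np.mean((si_temporal[sl] - ref) ** 2)
            err_v = np.mean((si_interview[sl] - ref) ** 2)
            fused[sl] = si_temporal[sl] if err_t <= err_v else si_interview[sl]
    return fused
```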

Abstract (Chinese) I
Abstract (English) II
Table of Contents III
List of Figures VI
List of Tables X
Chapter 1 Introduction 1
1.1 Motivation and Objectives 1
1.2 Problem Description and Research Approach 2
1.3 Thesis Organization 4
Chapter 2 Background and Related Work 5
2.1 Multi-view Video plus Depth Coding with Distributed Video Coding 5
2.1.1 Multi-view Video Coding 5
2.1.2 Multi-view Video plus Depth 6
2.1.3 Distributed Video Coding 8
2.2 Side Information Reconstruction from Multi-view Correlations 11
2.2.1 Motion-Compensated Temporal Interpolation 11
2.2.2 Inter-view Perspective Transform Model 12
2.2.3 Hybrid Multi-view Motion Estimation 14
2.3 Simulation Tools 15
2.3.1 The H.264 Video Coder 16
2.3.2 The RCPT Turbo Coder 17
2.3.3 The SIFT Feature Matching Tool 22
Chapter 3 Multi-view Distributed Video plus Depth Coding 26
3.1 The MDVDC System 26
3.2 Depth-Image Mapping Perspective Transform 32
3.2.1 Computing the Depth Transform Matrix 33
3.2.2 Perspective Transform of Depth Images 35
3.2.3 Depth-to-Luminance Image Mapping 36
3.2.4 Inter-pixel Interpolation for Hole Filling 36
3.2.5 Correcting Contour and Black-Edge Artifacts 37
3.3 Depth-Transform Block Synthesis Algorithm for Side Information 39
3.3.1 Side Information Reconstruction for Virtual Views 39
3.3.2 Block-Matching Prediction with Depth Mapping Transform 41
Chapter 4 Simulation Results and Comparisons 46
4.1 Experimental Parameter Settings 46
4.2 Experimental Comparisons 49
4.2.1 Side Information Quality for Virtual Views 49
4.2.2 PSNR Performance of Decoded Images between Virtual Views 51
4.2.3 Side Information Quality with Multi-dimensional Correlations 54
4.2.4 PSNR Performance of Multi-dimensionally Decoded Images 56
4.3 Experimental Result Illustrations 59
4.3.1 Reconstructed Side Information Images 59
4.3.2 Decoded WZ Frames 64
4.4 Complexity Analysis of Side Information Methods 68
4.4.1 Complexity Analysis 68
4.4.2 Simulated Time Complexity Results 72
4.5 Error Analysis of Side Information 75
Chapter 5 Conclusions and Future Work 80
5.1 Conclusions 80
5.2 Future Research Directions 81
References 82

