
Graduate Student: Yi-Kang Hsu (徐意剛)
Thesis Title: Utilize Optical Flow Algorithm in Improving Distributed Multi-view Video Codec Performance (應用光流法於改善分散式多視角視訊編碼效能)
Advisor: Jiann-Jone Chen (陳建中)
Committee Members: Tien-Ying Kuo (郭天穎), Kuo-Liang Chung (鍾國亮), Tien-Ruey Hsiang (項天瑞)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2012
Graduation Academic Year: 101 (ROC)
Language: Chinese
Number of Pages: 82
Keywords: Multi-view Video Coding, Distributed Video Coding, Multi-view Distributed Video Coding, Total Variation L1-norm Optical Flow
With the advance of video coding technology, multimedia communication has progressed from passively receiving video information to faithfully presenting the depth and stereoscopic perception of natural scenes, and 3D video and multi-view video coding have become one of the mainstream directions of next-generation multimedia. Compared with conventional single-view coding, 3D video coding and multi-view video coding (MVC) involve a much larger amount of data and very high computational complexity. For portable encoders or wireless video sensor network applications, distributed video coding (DVC) is adopted to handle multi-view video transmission by shifting the heavy computation to the decoder, a combination referred to as multi-view distributed video coding (MDVC). In this thesis, under the MDVC framework, we integrate temporal and inter-view correlations among video frames and propose methods to improve coding quality: for GOP=1, side information (SI) is improved by integrating Homography, Optical flow and feedBack (HOB); for GOP=2, SI is improved by a Hybrid Homography and Optical flow scheme along the Spatial and Temporal dimensions (HHOST). Specifically, (1) two homography-transformed virtual images (Homography Left/Homography Right, HL/HR) are introduced into the MDVC joint decoder, which noticeably reduces ghosting artifacts in the side-information images; (2) the Total Variation L1-norm Optical Flow (TV-L1 OF) is introduced to predict the motion of every pixel from image intensity, yielding the best side-information quality. Experimental results show that, for GOP=1, the proposed method improves PSNR by about 0.1~3 dB and reduces turbo decoding time by up to about 13% compared with disparity compensation view prediction (DCVP); for GOP=2, it improves PSNR by about 4~9 dB over DCVP and about 1.5~3 dB over motion compensation temporal interpolation (MCTI), while reducing turbo decoding time by about 15.73%.
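To make the homography step above concrete, the following is a minimal sketch, not the thesis implementation, of how a side-view frame could be warped toward the central view to form a virtual image such as HL or HR. It assumes opencv-python 4.4 or later (for `cv2.SIFT_create`); the helper names `estimate_homography` and `warp_to_center` are illustrative and not taken from the thesis.

```python
# Hypothetical sketch: warp a side-view frame toward the central view with a
# homography estimated from SIFT matches, in the spirit of the HL/HR virtual
# images described above. Illustrative only, not the thesis implementation.
import cv2
import numpy as np

def estimate_homography(src_gray, dst_gray, ratio=0.75):
    """Estimate a homography mapping src-view pixels onto the dst view."""
    sift = cv2.SIFT_create()                      # requires OpenCV >= 4.4
    kp1, des1 = sift.detectAndCompute(src_gray, None)
    kp2, des2 = sift.detectAndCompute(dst_gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = matcher.knnMatch(des1, des2, k=2)
    good = [m for m, n in matches if m.distance < ratio * n.distance]
    src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
    return H

def warp_to_center(side_frame, center_frame):
    """Produce a homography-warped virtual image (e.g. an HL or HR candidate)."""
    side_gray = cv2.cvtColor(side_frame, cv2.COLOR_BGR2GRAY)
    center_gray = cv2.cvtColor(center_frame, cv2.COLOR_BGR2GRAY)
    H = estimate_homography(side_gray, center_gray)
    h, w = center_frame.shape[:2]
    return cv2.warpPerspective(side_frame, H, (w, h))

# Usage (illustrative): HL = warp_to_center(left_frame, center_key_frame)
#                       HR = warp_to_center(right_frame, center_key_frame)
```

The SIFT matching plus RANSAC model fitting mirrors the tools listed in the thesis outline; how the HL/HR candidates are actually fused into side information follows the thesis's own scheme, not the simple warp shown here.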


    With the advance of video communication technology, the multimedia platform can not only receive and play video streams but also provide depth and stereo information of the natural scene for better visual perception. 3D video processing and multi-view video coding (MVC) have become the mainstream of next-generation multimedia processing. However, the huge amount of video data and the computational complexity required for 3D video and MVC make it difficult to deploy them on mobile encoders or wireless video sensor network platforms. The distributed video codec (DVC) can be utilized to shift encoder complexity to the decoder under the MVC framework, denoted as multi-view DVC (MDVC). We propose to utilize an optical flow algorithm to improve motion estimation, together with homography transforms between inter-view videos, abbreviated as HOB (Homography, Optical flow, feedBack) for GOP=1 and HHOST (Hybrid Homography and Optical flow along the Spatial and Temporal dimensions) for GOP=2, to better exploit temporal, intra-view and inter-view correlations among images and improve MDVC performance. In detail: (1) the homography-transformed images from the left and right views can be utilized at the MDVC joint decoder to reconstruct the side information (SI) without severe object-overlapping artifacts; (2) to improve the confidence of the SI images, the Total Variation L1-norm Optical Flow (TV-L1 OF) is adopted to predict pixel-based motion vectors from the left and right view images toward the central view to yield the SI image. Experiments showed that, for GOP=1, the proposed HOB outperformed previous works by about 0.1~3 dB in PSNR and reduced turbo decoding time by about 13%, compared with the disparity compensation view prediction (DCVP) algorithm. For GOP=2, the proposed HHOST improved PSNR by about 4~9 dB and 1.5~3 dB over the DCVP and motion compensation temporal interpolation (MCTI) algorithms, respectively, and reduced turbo decoding time by about 15.73% compared with MCTI.
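As a companion to the homography sketch above, the following is a minimal, hypothetical illustration of the per-pixel motion prediction step: computing a dense TV-L1 flow field and warping an image with it. It assumes opencv-contrib-python, which provides `cv2.optflow.createOptFlow_DualTVL1`; the helper names and the fusion shown in the trailing comments are illustrative and do not reproduce the thesis's SI reconstruction.

```python
# Hypothetical sketch: dense TV-L1 optical flow between two frames and
# motion-compensated warping, as one building block of SI prediction.
# Assumes opencv-contrib-python for the cv2.optflow module.
import cv2
import numpy as np

def tvl1_flow(prev_gray, next_gray):
    """Dense per-pixel flow from prev_gray to next_gray via TV-L1."""
    tvl1 = cv2.optflow.createOptFlow_DualTVL1()
    return tvl1.calc(prev_gray, next_gray, None)   # HxWx2 float32 (dx, dy)

def warp_with_flow(frame, flow):
    """Backward-warp `frame`: sample it at (x + dx, y + dy) via remap."""
    h, w = flow.shape[:2]
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    return cv2.remap(frame, map_x, map_y, cv2.INTER_LINEAR)

# Illustrative fusion idea only: HL and HR both approximate the central WZ
# frame, so a flow field between them can be computed and HR pulled halfway
# toward HL before blending the two candidates into a crude SI estimate.
# flow_lr = tvl1_flow(HL_gray, HR_gray)              # flow on HL's grid
# hr_half = warp_with_flow(HR, 0.5 * flow_lr)        # HR moved halfway
# si = cv2.addWeighted(HL, 0.5, hr_half, 0.5, 0)
```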

    Abstract (Chinese) I
    ABSTRACT II
    Acknowledgments III
    Table of Contents IV
    List of Figures VII
    List of Tables X
    Chapter 1 Introduction 1
    1.1 Research Motivation and Objectives 1
    1.2 Problem Description and Research Methods 2
    1.3 Thesis Organization 4
    Chapter 2 Background and Related Work 5
    2.1 Integrating Multi-view and Distributed Video Coding 5
    2.1.1 Origin of Multi-view Video Coding 5
    2.1.2 Distributed Video Coding 6
    2.1.3 Integrated Distributed and Multi-view Video Coding Architecture 8
    2.2 Side Information Reconstruction from Multi-view Correlations 9
    2.2.1 Motion Compensation Temporal Interpolation 9
    2.2.2 Disparity Compensation View Prediction 10
    2.2.3 Homography (Perspective Transform) Model 11
    2.2.4 Optical Flow Motion Estimation 13
    2.3 Related Simulation Tools 18
    2.3.1 H.264/AVC Video Codec 18
    2.3.2 RCPT Turbo Codec 20
    2.3.3 SIFT Feature Matching 24
    2.3.4 HSV Color Coding 28
    Chapter 3 Improved Multi-view Distributed Video Coding System 29
    3.1 MDVC System 29
    3.2 TV-L1 Optical Flow and Homography Methods 35
    3.2.1 Total Variation L1-norm Optical Flow 35
    3.2.2 Homography-Transformed Virtual Images 37
    3.3 Hybrid Optical Flow and Homography Method 40
    3.3.1 Side Information Reconstruction for the Virtual View at GOP=1 40
    3.3.2 Hybrid Optical Flow and Homography Algorithm for Side Information Reconstruction at GOP=2 43
    Chapter 4 Simulation Results and Comparisons 46
    4.1 Experimental Parameter Settings 46
    4.2 Experimental Data Comparisons 49
    4.2.1 Side Information Quality of the Virtual View 49
    4.2.2 Decoded PSNR Performance between Virtual Views 52
    4.2.3 Side Information Quality with Multi-dimensional Correlations 55
    4.2.4 Decoded PSNR Performance with Multi-dimensional Correlations 57
    4.3 Experimental Result Presentation 60
    4.3.1 Reconstructed Side Information Images 60
    4.3.2 Decoded WZF Images 66
    4.3.3 Time Complexity of Side Information Reconstruction and Coding 71
    Chapter 5 Conclusions and Future Work 74
    5.1 Conclusions 74
    5.2 Future Work 75
    5.3 Research Suggestions 76
    References 80

