
Graduate student: 盧永晟
Uang-Chang Lu
Thesis title: 基於 H.264 動態估測硬體設計之快速模式決擇演算法結合自適應降取樣方法
A Fast Mode Decision Algorithm incorporates with the Adaptive Down-Sampling for H.264/AVC Motion Estimation and its VLSI Design
Advisor: 呂學坤
Shyue-Kung Lu
Oral defense committee: 郭斯彥
Sy-Yen Kuo
李進福
Jin-Fu Li
陳俊良
Jiann-Liang Chen
洪進華
Jin-Hua Hong
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2013
Academic year of graduation: 101 (ROC calendar)
Language: Chinese
Number of pages: 78
Chinese keywords: H.264 高階視訊編碼, 動態估測, 模式決擇演算法, 降取樣演算法, 動態搜尋範圍, 均差運算, 多重小型處理單元陣列結構
English keywords: H.264 AVC, Motion Estimation, Mode Decision Algorithm, Down-Sampling Algorithm, Search Range Adjustment, Mean Deviation, Multiple Small Scale Process Element Array

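The H.264/AVC video coding standard adopts many new techniques that improve coding efficiency by at least 50% over earlier standards, and it has therefore become the mainstream coding method. These techniques, however, also greatly increase the computation and complexity of encoding. Taking the HDTV specification as an example, the complete video encoding process involves 3,600 giga-instructions and 5,570 GBytes of data transfer per second, and notably about 95% of this comes from motion estimation. Many studies have therefore proposed ways to accelerate and simplify motion estimation.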

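Fast search algorithms, down-sampling algorithms, mode decision algorithms, hardware-oriented data reuse designs, and hardware architectures with enhanced parallelism can all accelerate motion estimation effectively at a given cost. However, as higher resolutions and real-time encoding are pursued, and because flexibility is limited once an algorithm is bound to hardware, circuit cost and computation time keep rising. Based on the idea of performing only the computations that are necessary, and building on earlier methods, this study proposes Mean Deviation Information Mode Decision (MDIMD), which is centered on the mean deviation computation, combined with an Adaptive Down-Sampling (ADS) algorithm to reduce computation and data transfer. Exploiting the properties of adaptive down-sampling, it also proposes multiple small-scale Processing Element Arrays (PEA) to strengthen parallel computation.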

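MDIMD predicts several candidate combination modes of variable block sizes and uses the rate-distortion cost formula to select among them, which yields more reliable results. The mean deviation computation at its core is simple and, when integrated with the processing element arrays, can reuse their hardware. The ADS algorithm adjusts the down-sampling ratio dynamically according to how homogeneous a block is, eliminating redundant similar computations while preserving a certain level of coding quality. In addition, the mean deviation of neighboring blocks' motion vectors serves two purposes: it avoids the distortion that down-sampling would introduce when the picture changes drastically, and it is used to adjust the search range of motion estimation, which greatly reduces the computation. Because of how ADS works, the multiple small-scale PEAs can be combined to process a block that is not down-sampled, while down-sampled blocks are processed several times faster.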
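The idea of tying the down-sampling ratio to block homogeneity can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the thresholds `low` and `high`, and the intermediate 1:2 ratio are assumptions made for illustration. The record itself only mentions blocks that are not down-sampled and 1:4 down-sampled blocks.

```python
def mean_deviation(block):
    """Mean absolute deviation of the pixel values from the block mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels) / len(pixels)


def choose_downsampling_ratio(block, low=2.0, high=8.0):
    """Return N for a 1:N down-sampling ratio.

    `low` and `high` are hypothetical thresholds, not values from the thesis.
    """
    md = mean_deviation(block)
    if md < low:
        return 4  # very homogeneous block: keep one pixel in four
    if md < high:
        return 2  # moderately homogeneous block (assumed intermediate ratio)
    return 1      # textured block: keep every pixel


flat = [[100] * 4 for _ in range(4)]
textured = [[0, 255, 0, 255],
            [255, 0, 255, 0],
            [0, 255, 0, 255],
            [255, 0, 255, 0]]
print(choose_downsampling_ratio(flat), choose_downsampling_ratio(textured))  # prints "4 1"
```

A flat block has a mean deviation of zero and is sampled most sparsely, while the checkerboard block deviates strongly from its mean and is kept at full resolution.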

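Simulation results show that the PSNR loss is negligible, while motion estimation time is reduced by about 70% to 75% and the bitrate increases by only about 1% to 2%. The proposed method can also be combined with other fast motion estimation algorithms for further acceleration. The hardware implementation, using four small-scale PEAs as an example, is built in a TSMC 0.18 um process and operates at 125 MHz, with a total gate count of 181 K and 3 KBytes of memory. The search range is set to 32 × 32 and can be adjusted to 16 × 16. In the fastest case a 1:4 down-sampled block is processed in 64 clock cycles, and in the slowest case a block that is not down-sampled is processed in 1350 clock cycles.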


The H.264/AVC standard adopts several new coding methods that improve coding performance by at least 50 percent over previous standards but also greatly increase computational complexity. Taking HDTV as an example, the overall H.264 coding process requires 3,600 giga-instructions per second and 5,570 GBytes of data transmission bandwidth. According to statistics, about 95% of these computations are related to motion estimation (ME).

There are many methodsthe fast search algorithms, down-sampling algorithms, mode decision algorithms, and hardware designs, which target on the circuit designs, data reuse, and parallel computation in order to efficiently accelerate the ME computations. However, the incurred cost keeps increasing because the requirement of video resolution and real-time coding are still required. Therefore, in this thesis, based on the concepts of only perform the “must” computations and the previous methods, we propose the Mean Deviation Information Mode Decision (MDIMD) algorithm which incorporates with the Adaptive Down Sampling (ADS) algorithm in order to reduce the computation complexity while maintains the required quality level. Moreover, Multiple Small-Scale Processing Element Arrays (PEA) are proposed as the computation core for the ME to enhance the parallel computation ability.

The proposed MDIMD predicts candidate modes of the variable block sizes and lets the rate-distortion cost (RDC) determine the exact mode combination, which makes the results more dependable. As the core of MDIMD, the mean deviation computation is simple and can be performed by reusing the PEAs. According to the results of MDIMD, the ADS algorithm adaptively chooses a proper down-sampling rate to exclude low-impact computations while maintaining video quality. Moreover, the mean deviation of the neighboring blocks' motion vectors (MVs) can be used to identify high-motion regions, which avoids the aliasing problem, and to adjust the search range dynamically. Because the ADS algorithm is adopted, each small PEA can process a down-sampled block, so the multiple small-scale PEAs can either work together to process a non-down-sampled block or process multiple down-sampled blocks simultaneously.
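As a rough illustration of how the mean deviation of neighboring MVs might drive both decisions, the Python sketch below flags a high-motion region and picks a search range. The function names, the threshold value, and the choice to sum the x and y deviations are assumptions, not details taken from the thesis. The 32 and 16 values correspond to the 32 × 32 and 16 × 16 search ranges reported for the hardware.

```python
def mv_mean_deviation(mvs):
    """Mean absolute deviation of neighbouring MVs, summed over x and y."""
    n = len(mvs)
    mean_x = sum(x for x, _ in mvs) / n
    mean_y = sum(y for _, y in mvs) / n
    return sum(abs(x - mean_x) + abs(y - mean_y) for x, y in mvs) / n


def choose_search_range(neighbour_mvs, threshold=4.0):
    """Return (search_range, allow_downsampling).

    `threshold` is a hypothetical value chosen for illustration.
    """
    if mv_mean_deviation(neighbour_mvs) < threshold:
        return 16, True   # coherent motion: smaller window, down-sampling allowed
    return 32, False      # high-motion region: full window, no down-sampling


calm = [(1, 0), (1, 1), (2, 0)]
busy = [(-12, 3), (10, -8), (0, 15)]
print(choose_search_range(calm))  # prints "(16, True)"
print(choose_search_range(busy))  # prints "(32, False)"
```

Neighbors that move together give a small deviation, so the sketch shrinks the search window. Scattered vectors mark a high-motion region, where it keeps the full window and disables down-sampling.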

Simulation results show that the PSNR loss is negligible and the ME computation time can be reduced by about 70% to 75%, while the bitrate increases by only 1% to 2%. Note that our algorithm is compatible with other fast ME algorithms for further speedup. The hardware implementation, based on four small-scale PEAs, uses the TSMC 0.18 um process and operates at 125 MHz with 3 KBytes of SRAM. Assuming a search range of 32 × 32, a current block search finishes within 64 clock cycles in the best case and 1350 clock cycles in the worst case.
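The reported cycle counts can be converted to wall-clock time per block with simple arithmetic. The short Python sketch below does this using only the 125 MHz clock and the 64-cycle and 1350-cycle figures quoted above; the helper name `cycles_to_ns` is introduced here for illustration.

```python
CLOCK_HZ = 125_000_000  # 125 MHz operating frequency from the abstract


def cycles_to_ns(cycles, clock_hz=CLOCK_HZ):
    """Convert a clock-cycle count to nanoseconds at the given frequency."""
    period_ns = 1_000_000_000 / clock_hz  # 8 ns per cycle at 125 MHz
    return cycles * period_ns


best_ns = cycles_to_ns(64)     # best case reported in the abstract
worst_ns = cycles_to_ns(1350)  # worst case reported in the abstract
print(f"best: {best_ns:.0f} ns, worst: {worst_ns:.0f} ns")  # prints "best: 512 ns, worst: 10800 ns"
```

At an 8 ns clock period, the best case corresponds to about 0.5 microseconds per block search and the worst case to about 10.8 microseconds.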

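Chinese Abstract
English Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Research Background and Motivation
  1.2 Chapter Overview
Chapter 2 Introduction to H.264 and Motion Estimation Coding
  2.1 H.264 Coding Specification
    2.1.1 Terminology and Techniques
    2.1.2 Coding Flow
  2.2 Motion Estimation
Chapter 3 Literature Review
  3.1 Fast Motion Estimation Algorithms
    3.1.1 Fast Search Algorithms
    3.1.2 Down-Sampling Algorithms
    3.1.3 Dynamic Search Range
  3.2 Mode Decision Algorithms and Homogeneity Detection
  3.3 Data Reuse Design
  3.4 Hardware Architecture Design
    3.4.1 Highly Parallel Processing Architecture
    3.4.2 Edge Information Mode Decision
  3.5 Summary and Discussion
Chapter 4 Mean Deviation Information Mode Decision and Adaptive Down-Sampling Algorithm
  4.1 Mean Deviation Computation
  4.2 Mean Deviation Information Mode Decision and Adaptive Down-Sampling Algorithm
  4.3 Mean Deviation of Motion Vectors
  4.4 Hardware Design
    4.4.1 Hardware Architecture and Operation Flow
    4.4.2 Current Buffer and Mean Deviation Computation
    4.4.3 Reference Buffer and Multiple Small-Scale Processing Element Array Architecture
    4.4.4 Memory Allocation and Data Flow
Chapter 5 Experimental Results
  5.1 Simulation Results
  5.2 Hardware Implementation
    5.2.1 Hardware Specifications and Synthesis Results
    5.2.2 Performance Analysis and Comparison
Chapter 6 Conclusion
References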

