
Graduate student: 宋啟嘉 (Chi-Chia Sun)
Thesis title: A Low-Power and High-Quality Cordic Based Loeffler DCT for Signal Processing
Advisor: 阮聖彰 (Shanq-Jang Ruan)
Committee members: 尤根 (Juergen Goetze), 楊佳玲 (Chia-Ling Yang), 賴坤財 (Kuen-Tsair Lay), 許孟超 (Mon-Chau Shie)
Degree: Master
Department: Department of Electronic and Computer Engineering
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 74
Keywords (translated from Chinese): low power, discrete cosine transform, high quality
Keywords (English): Discrete Cosine Transform, low-power, Loeffler DCT, Cordic based DCT, high-quality

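In recent years, low-power design has become a research focus in many fields. Earlier work on low-power Discrete Cosine Transform (DCT) design concentrated mostly on reducing the power consumption of the circuit, or on reducing the overall amount of DCT computation, in order to save power. However, as technology advances and users demand higher image quality, designs that combine low-power computation with high-quality compression have drawn increasing attention. In this thesis, we therefore propose a Cordic based Loeffler DCT architecture that offers both low-power signal processing and a high-quality transform. The architecture needs only 38 add and 16 shift operations to carry out a multiplierless DCT.

The proposed architecture was synthesized with the TSMC 0.13-μm process, and Synopsys PrimePower was used to measure the gate-level power consumption. In addition, the architecture was embedded into JPEG and into the MPEG-4 Xvid codec. According to the simulation results, the new architecture occupies only 19% of the area and consumes only 16% of the power of the original Loeffler DCT, while maintaining a transformation quality similar to that of the Loeffler DCT.

Finally, the proposed Cordic based Loeffler DCT architecture is well suited both to low-power, high-quality CODECs and to VLSI implementation. It is particularly suitable for embedded systems, high-quality CODECs, and low-power mobile devices.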


In this master's thesis, a low-power and high-quality preserving DCT architecture is presented. It is obtained by optimizing the Loeffler DCT based on the Cordic algorithm. The computational complexity is reduced significantly from 11 multiply and 29 add operations (Loeffler DCT) to 38 add and 16 shift operations (i.e., similar to the complexity of the binDCT) without sacrificing the transformation accuracy. Like the binDCT, this implementation performs a multiplierless DCT transformation.
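The abstract does not spell out the optimized flow graph, so the following is only a generic sketch of the underlying idea: a textbook CORDIC rotation that replaces the multiplications of a plane rotation with shifts and additions. The names `FRAC_BITS`, `ITERATIONS`, `ANGLES`, `GAIN`, and `cordic_rotate` are illustrative assumptions, as is the choice of 16 iterations; the thesis architecture uses far fewer operations (38 adds and 16 shifts for the whole 8-point DCT) and is not reproduced here.

```python
import math

FRAC_BITS = 16    # assumed fixed-point precision (fractional bits)
ITERATIONS = 16   # assumed number of micro-rotations

# Micro-rotation angles atan(2^-i), stored as fixed-point integers.
ANGLES = [round(math.atan(2.0 ** -i) * (1 << FRAC_BITS))
          for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
# GAIN is the accumulated scale factor of all iterations.
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i))
                 for i in range(ITERATIONS))


def cordic_rotate(x, y, angle):
    """Rotate the integer vector (x, y) by `angle` radians.

    Only shifts and additions are used inside the loop. The result
    is scaled by GAIN; the caller is responsible for compensating.
    Converges for |angle| up to roughly 1.74 rad.
    """
    z = round(angle * (1 << FRAC_BITS))  # remaining angle, fixed point
    for i in range(ITERATIONS):
        if z >= 0:
            # Rotate counter-clockwise by atan(2^-i).
            x, y = x - (y >> i), y + (x >> i)
            z -= ANGLES[i]
        else:
            # Rotate clockwise by atan(2^-i).
            x, y = x + (y >> i), y - (x >> i)
            z += ANGLES[i]
    return x, y


if __name__ == "__main__":
    # Illustrative angle only; rotate the unit vector (1, 0).
    theta = 3 * math.pi / 16
    xr, yr = cordic_rotate(1 << FRAC_BITS, 0, theta)
    scale = GAIN * (1 << FRAC_BITS)
    print(xr / scale, math.cos(theta))
    print(yr / scale, math.sin(theta))
```

In this sketch the gain is removed in floating point by the caller purely for checking the result. A hardware design would typically handle the scale factor with further shift-and-add steps or absorb it into a later stage; how the thesis handles compensation is covered in its Section 2.2 and is not assumed here.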

In our experiments, we used different criteria to evaluate the proposed DCT architecture. After synthesis with the TSMC 0.13-μm technology library, Synopsys PrimePower was used to estimate the power consumption at gate level. We then embedded our DCT architecture into the JPEG and XVID CODECs to compare and analyze the quality of the compression results. The experimental results show that the proposed DCT architecture occupies only 19% of the area and consumes about 16% of the power of the Loeffler DCT. Moreover, it retains the good transformation quality of the original Loeffler DCT.

As a result, the proposed Cordic based Loeffler DCT is not only very suitable for low-power and high-quality CODECs but also highly suited for VLSI implementations, since it needs only add and shift operations to carry out the DCT transformation. Finally, it is worth noting that the presented Cordic based Loeffler DCT architecture is especially suited for embedded systems, high-quality CODECs and mobile hand-held devices due to its low power and small area properties.

Table of Contents
List of Tables
List of Figures
Abstract
Acknowledgements
Introduction
1 DCT Algorithms
  1.1 The DCT Background
  1.2 Loeffler DCT
  1.3 binDCT
2 Cordic Algorithms
  2.1 Cordic Algorithm Background
  2.2 Cordic Rotation and Compensation
  2.3 Cordic Architecture
  2.4 Cordic based DCT
3 Low Power Background
  3.1 Synthesis for Low Power
  3.2 Source of Power Dissipation
  3.3 Low Power DCT Architecture
4 Cordic based Loeffler DCT
  4.1 Cordic based Loeffler DCT Algorithm
  4.2 Improved Cordic based Loeffler DCT
  4.3 Flow Graph of the Cordic based Loeffler DCT
5 Experimental Results
  5.1 Simulation Environment
  5.2 Experimental Results
  5.3 Performance of Cordic based Loeffler DCT in JPEG
  5.4 Performance of Cordic based Loeffler DCT in XVID
  5.5 Evaluation Metric of Power and Quality
6 Conclusion
Bibliography


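Full-text release date: 2007/04/18 (campus network)
Full-text release date: full text not authorized for public release (off-campus network)
Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)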