Graduate student: | 吳奇育 Chi-Yu Wu |
---|---|
Thesis title: | 低功率之預先計算內容可定址記憶體 A New Block-XOR Precomputation-Based Content Addressable Memory for Low Power System |
Advisor: | 阮聖彰 Shanq-Jang Ruan |
Oral defense committee: | 楊佳玲 Chia-Lin Yang, 陳宏明 Hung-Ming Chen, 許孟超 Mon-Chau Shie, 張延任 Yen-Jen Chang |
Degree: | Master |
Department: | 電資學院 - 電子工程系 Department of Electronic and Computer Engineering |
Year of publication: | 2006 |
Academic year of graduation: | 94 |
Language: | Chinese |
Number of pages: | 65 |
Keywords (Chinese): | 記憶體, 低功率, 內容可定址記憶體 |
Keywords (English): | memory, low power, content addressable memory |
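In many applications, such as lookup tables, databases, associative computing, and gigabit Ethernet, content addressable memory (CAM) is often used to improve search speed because it compares in parallel and thereby reduces search time. Parallel comparison, however, always consumes more power. To save more power, this thesis proposes a new Precomputation-Based Content Addressable Memory (PB-CAM) architecture that improves on the original PB-CAM design. Although PB-CAM lowers power consumption by reducing the number of comparison operations, the design of the original parameter extractor greatly limits how many comparisons can be eliminated. This thesis therefore designs a new parameter extraction method, "Block-XOR", to improve the efficiency of PB-CAM. Mathematical analysis shows that for 32-bit input data, the proposed method can save up to half of the comparisons. The experiments use the TSMC 0.18-μm process at 1.8 V. For accuracy, power consumption is measured with Synopsys Nanosim.
The experimental results show that when Block-XOR and ones count both use the standard memory cell design (9 transistors), Block-XOR saves 72% of the power consumption. Even when ones count uses the special memory cell design (7 transistors), the proposed method still saves 33% of the power. The main purpose of this work is to provide theoretical and practical evidence that the Block-XOR PB-CAM system achieves further power reduction and that the CAM cell circuit does not require a special design. In other words, Block-XOR can be applied to a wider variety of memory cell designs to meet designers' needs while additionally providing low power consumption.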
In many applications, such as lookup tables, databases, associative computing, and gigabit Ethernet, content addressable memory (CAM) is often used to improve performance because it compares in parallel and so reduces search time. However, parallel comparison always consumes a significant amount of power. In this paper, we propose a new precomputation-based content addressable memory (PB-CAM) structure that reduces CAM system power. Although the PB-CAM uses precomputation to cut the number of comparison operations and thus the power consumption, its ones count parameter extraction limits how many comparisons can be eliminated. We therefore devise a Block-XOR approach to improve the efficiency of the PB-CAM. Mathematical analysis shows that, for 32-bit data, our approach reduces the average number of comparisons to 50% compared with the ones count approach. Incidentally, the ones count parameter extractor circuit occupies three times the area of our Block-XOR extractor. In the experiments, we use the TSMC 0.18-μm process and estimate power with Synopsys Nanosim for accuracy. Compared with the ones count approach, our approach achieves a 72% average power reduction with the standard memory cell design (nine transistors). Furthermore, our Block-XOR with the standard memory cell design consumes 33% of the power of ones count with the special memory cell design (seven transistors) in its best case. Moreover, the special memory cell is not a standard design and is suitable only for the ones count approach. The major contribution of this paper is theoretical and practical evidence that our Block-XOR PB-CAM system achieves further power reduction without requiring a special CAM cell design, which implies that our approach is more flexible and adapts better to general designs. Further, because its number of comparison operations is invariable (the search operation has a constant delay), the Block-XOR PB-CAM system is even more suitable for real-time applications.
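The two-stage matching idea behind a PB-CAM can be illustrated with a small behavioral model. The following is only a sketch under stated assumptions, not the circuit described in the thesis: the abstract does not specify how Block-XOR partitions the word, so the 8-bit blocks with one parity bit each are an assumption, and the names `PBCAM`, `ones_count`, and `block_xor` are hypothetical. Each stored word carries a precomputed parameter, and a full-word comparison is performed only for entries whose parameter equals that of the search word.

```python
import random

WIDTH = 32  # word width in bits, matching the 32-bit case in the abstract


def ones_count(word: int) -> int:
    """Ones count parameter: the number of 1 bits in the word (0..WIDTH)."""
    return bin(word).count("1")


def block_xor(word: int, block_bits: int = 8) -> int:
    """Block-XOR style parameter: one parity bit per block.

    The block size is an assumption made for illustration only.
    """
    param = 0
    for i in range(0, WIDTH, block_bits):
        block = (word >> i) & ((1 << block_bits) - 1)
        parity = bin(block).count("1") & 1  # XOR of all bits in the block
        param |= parity << (i // block_bits)
    return param


class PBCAM:
    """Hypothetical behavioral model of a precomputation-based CAM."""

    def __init__(self, extractor):
        self.extractor = extractor
        self.entries = []  # list of (parameter, word) pairs

    def write(self, word: int) -> None:
        # The parameter is computed once, when the word is stored.
        self.entries.append((self.extractor(word), word))

    def search(self, word: int):
        """Return (matching addresses, number of full-word comparisons)."""
        key = self.extractor(word)
        matches, comparisons = [], 0
        for addr, (param, stored) in enumerate(self.entries):
            if param != key:
                continue  # stage 1: parameter mismatch, skip the full compare
            comparisons += 1  # stage 2: full-word comparison
            if stored == word:
                matches.append(addr)
        return matches, comparisons


if __name__ == "__main__":
    rng = random.Random(0)
    words = [rng.getrandbits(WIDTH) for _ in range(1024)]
    for name, extractor in (("ones count", ones_count), ("block XOR", block_xor)):
        cam = PBCAM(extractor)
        for w in words:
            cam.write(w)
        total = sum(cam.search(w)[1] for w in words)
        print(f"{name}: {total / len(words):.1f} full comparisons per search")
```

The model counts only how many full-word comparisons each extractor leaves for the second stage on uniformly random data. It does not model power, area, or the memory cell designs, and its counts should not be read as a reproduction of the thesis's figures.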