Synthesis of Low Power PB-CAM | National Taiwan University of Science and Technology Electronic Theses and Dissertations System

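Graduate student:	陳俊致 Chun-Chih Chen
Thesis title:	Synthesis of Low Power PB-CAM (低功率預先計算內容可定址記憶體之合成)
Advisor:	阮聖彰 Shanq-Jang Ruan
Committee members:	許孟超 Mon-Chau Shie, 楊佳玲 Chia-Lin Yang, 張延任 Yen-Jen Chang, 陳宏明 Hung-Ming Chen
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication:	2006
Academic year of graduation:	94 (ROC calendar)
Language:	English
Number of pages:	34
Keywords (translated from Chinese):	precomputation-based content addressable memory, low power, block selection algorithm, benchmark
Keywords (English):	low power, PB-CAM, Gate-Block-Selection Algorithm, benchmark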

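Parallel comparison gives content addressable memory (CAM) a fast data search speed. CAM is therefore widely used in many applications, such as asynchronous transfer mode (ATM), communication networks, LAN bridges/switches, databases, lookup tables, and tag directories in fully associative cache systems. However, parallel comparison consumes considerable power. This thesis therefore proposes a Gate-Block-Selection algorithm. Built on the Precomputation-Based Content Addressable Memory (PB-CAM) architecture, the method reduces the number of parallel comparisons through the selection of gate blocks. According to program simulation results, the method reduces the number of parallel comparisons by 19.0% to 26.6%. The experiments use the TSMC 0.18 μm process at 1.8 V. For accuracy, power is measured with Synopsys Nanosim. The experimental results show that the method saves 3.1% to 13.3% of the power. The main purpose of this work is to provide theoretical and practical evidence that the Gate-Block-Selection algorithm can further save power across different benchmarks. The Gate-Block-Selection algorithm is therefore well suited to embedded systems.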

Content addressable memory (CAM) is a major device in asynchronous transfer mode
(ATM), communication networks, LAN bridges/switches, databases, lookup tables,
and tag directories in fully associative cache systems, due to its high-speed parallel
data searching operation. This parallel search requires an enormous number
of comparison operations, and these comparison operations consume a large
amount of power.
In this paper, we propose a Gate-Block-Selection algorithm to construct a low
power PB-CAM that reduces the number of comparison operations. By our theoretical
analysis, the reduction is at least 19.0% and at most 26.6%; the total number of
comparison operations of our Gate-Block-Selection algorithm is much less than that
of the Block-XOR approach. Moreover, we implemented both the Block-XOR PB-CAM
and our Gate-Block-Selection PB-CAM in the TSMC 0.18-μm digital CMOS
process technology and measured the power consumption with Nanosim. Compared to
the Block-XOR approach, the experimental results show that our Gate-Block-Selection
PB-CAM consumes at least 3.1% and at most 13.3% less power than the Block-XOR
PB-CAM does.
The major contribution of our paper is that we provide theoretical and practical
evidence that our Gate-Block-Selection approach can further reduce the power
consumption across different benchmarks. Therefore, our Gate-Block-Selection PB-CAM
is very suitable for embedded systems.
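
The abstract does not describe the Gate-Block-Selection algorithm itself, so the following is only a minimal sketch of the general precomputation idea behind a PB-CAM: a short parameter is derived from each stored word, the search key's parameter is compared first, and full-word comparisons are performed only for entries whose parameter matches. The parameter function (`block_xor_parameter`, one parity bit per block), the class name `PBCam`, and the word and block widths are all assumptions made for illustration and are not taken from the thesis.

```python
def block_xor_parameter(word: int, width: int = 8, block: int = 4) -> int:
    """Hypothetical parameter extractor: one XOR (parity) bit per block of bits."""
    param = 0
    for i in range(0, width, block):
        chunk = (word >> i) & ((1 << block) - 1)
        parity = bin(chunk).count("1") & 1  # XOR of all bits in this block
        param |= parity << (i // block)
    return param


class PBCam:
    """Software model of a precomputation-based CAM (illustrative only)."""

    def __init__(self, words, width: int = 8, block: int = 4):
        self.width = width
        self.block = block
        # The parameter is computed once, when each word is stored.
        self.entries = [(block_xor_parameter(w, width, block), w) for w in words]

    def search(self, key: int):
        """Return (matching indices, number of full-word comparisons performed)."""
        key_param = block_xor_parameter(key, self.width, self.block)
        matches = []
        full_compares = 0
        for idx, (param, word) in enumerate(self.entries):
            if param != key_param:
                continue  # filtered out by the cheap parameter comparison
            full_compares += 1
            if word == key:
                matches.append(idx)
        return matches, full_compares


cam = PBCam([0b00000001, 0b00000011, 0b00010001, 0b11110000])
hits, compares = cam.search(0b00010001)
print(hits, compares)
```

In this model, the fewer stored words share a parameter value with the search key, the fewer full-word comparisons are needed, which is the quantity the abstract reports as reduced.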

Table of Contents
Abstract
Introduction
Previous Work and Observation
Problem Formulation
1 Gate-Block-Selection Algorithm
2 Valid Bit
CAM Circuit Design
Experimental Results
1 Experimental Environment
2 Theoretical Analysis
3 Power Consumption Measured by Nanosim
Conclusions
Bibliography

                                


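Full-text release date: 2008/06/07 (campus network)
Full-text release date: full text not authorized for public release (off-campus network)
Full-text release date: 2011/06/07 (National Central Library: Taiwan theses and dissertations system)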
