
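Graduate student: 陳俊致
Chun-Chih Chen
Thesis title: 低功率預先計算內容可定址記憶體之合成
Synthesis of Low Power PB-CAM
Advisor: 阮聖彰
Shanq-Jang Ruan
Oral defense committee: 許孟超
Mon-Chau Shie
楊佳玲
Chia-Lin Yang
張延任
Yen-Jen Chang
陳宏明
Hung-Ming Chen
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic Engineering
Department of Electronic and Computer Engineering
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: English
Number of pages: 34
Keywords (translated from Chinese): precomputation-based content addressable memory, low power, block selection algorithm, benchmark programs
Keywords (English): low power, PB-CAM, Gate-Block-Selection Algorithm, benchmark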
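  • Because it compares all entries in parallel, content addressable memory (CAM) offers fast data search. CAM is therefore widely used in many applications, such as asynchronous transfer mode (ATM), communication networks, LAN bridges/switches, databases, lookup tables, and tag directories in fully associative cache systems. Parallel comparison, however, always consumes considerable power. This thesis therefore proposes a Gate-Block-Selection Algorithm. Built on the Precomputation-Based Content Addressable Memory (PB-CAM) architecture, the method reduces the number of parallel comparisons through the selection of gate blocks. According to program simulation results, the method reduces the number of parallel comparisons by 19.0% to 26.6%. The experiments use the TSMC 0.18 μm process at 1.8 V. For accuracy, power is measured with Synopsys Nanosim. The experimental results show that the method saves 3.1% to 13.3% of the power. The main purpose of this work is to provide theoretical and practical evidence that the Gate-Block-Selection Algorithm can further save power across different benchmarks. The Gate-Block-Selection Algorithm is therefore well suited to embedded systems.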


    Content addressable memory (CAM) is a major device in asynchronous transfer mode
    (ATM), communication networks, LAN bridges/switches, databases, lookup tables,
    and tag directories in fully associative cache systems, due to its high-speed parallel
    data searching operation. However, the operation of CAM requires an enormous number
    of comparison operations, and these comparison operations consume a large
    amount of power.
    In this paper, we propose a Gate-Block-Selection algorithm to construct a low
    power PB-CAM that reduces the number of comparison operations. By our theoretical
    analysis, we can reduce this number by at least 19.0% and at most 26.6%. The total
    number of comparison operations of our Gate-Block-Selection algorithm is much less
    than that of the Block-XOR approach. Moreover, we implemented both the Block-XOR
    PB-CAM and our Gate-Block-Selection PB-CAM in the TSMC 0.18-μm digital CMOS
    process technology and measured the power consumption with Nanosim. Compared to
    the Block-XOR approach, the experimental results show that our Gate-Block-Selection
    PB-CAM consumes at least 3.1% and at most 13.3% less power than the Block-XOR
    PB-CAM does.
    The major contribution of our paper is that we provide theoretical and practical
    evidence that our Gate-Block-Selection approach can further reduce the power
    consumption across different benchmarks. Therefore, our Gate-Block-Selection
    PB-CAM is very suitable for embedded systems.
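The idea behind a precomputation-based CAM can be illustrated with a small software model: a short parameter is stored alongside each word, and only entries whose parameter equals that of the search key go on to a full-width comparison. The sketch below is an illustration under stated assumptions, not the thesis's circuit or algorithm. The function names, the 8-bit word width, the 2-bit block width, and the per-block parity used as a stand-in for a Block-XOR style parameter are all hypothetical, since this record does not define the parameter extractor or the Gate-Block-Selection procedure.

```python
from typing import Callable, List, Sequence, Tuple

WORD_BITS = 8   # assumed word width, for illustration only
BLOCK_BITS = 2  # assumed block width, for illustration only


def block_xor_parameter(word: int) -> int:
    """Hypothetical parameter extractor: XOR the bits inside each block.

    Returns one parity bit per block, packed into an integer
    (the lowest block maps to bit 0 of the parameter).
    """
    param = 0
    mask = (1 << BLOCK_BITS) - 1
    for i, start in enumerate(range(0, WORD_BITS, BLOCK_BITS)):
        block = (word >> start) & mask
        parity = bin(block).count("1") & 1  # XOR of the block's bits
        param |= parity << i
    return param


def pb_cam_search(
    entries: Sequence[int],
    key: int,
    extract: Callable[[int], int],
) -> Tuple[List[int], int]:
    """Return (matching addresses, number of full-width comparisons)."""
    # A hardware PB-CAM would compute these once, when a word is written.
    stored_params = [extract(word) for word in entries]
    key_param = extract(key)

    matches: List[int] = []
    full_comparisons = 0
    for addr, (word, param) in enumerate(zip(entries, stored_params)):
        if param != key_param:
            continue  # filtered out by the parameter, no full comparison
        full_comparisons += 1
        if word == key:
            matches.append(addr)
    return matches, full_comparisons


if __name__ == "__main__":
    table = [0b00000001, 0b00000010, 0b00000011, 0b00000111, 0b00000011]
    key = 0b00000011
    # Constant extractor: every entry passes the filter (plain CAM behaviour).
    print(pb_cam_search(table, key, lambda word: 0))
    # Parity-per-block extractor: only entries sharing the key's parameter are compared.
    print(pb_cam_search(table, key, block_xor_parameter))
```

Counting full-width comparisons in this way mirrors the metric the abstract reports for its theoretical analysis. How much the filter helps depends on how evenly the chosen parameter spreads the stored data, which is why the reduction varies by benchmark.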

    Table of Contents
    Abstract
    1 Introduction
    2 Previous Work and Observation
    3 Problem Formulation
    3.1 Gate-Block-Selection Algorithm
    3.2 Valid Bit
    4 CAM Circuit Design
    5 Experimental Results
    5.1 Experimental Environment
    5.2 Theoretical Analysis
    5.3 Power Consumption Measured by Nanosim
    6 Conclusions
    Bibliography


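    Full-text release date: 2008/06/07 (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: 2011/06/07 (National Central Library: Taiwan theses and dissertations system)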