
Graduate student: 邱錦洋 (Chin-Yang Chiu)
Thesis title: Adaptive Triple Modular Redundancy Technique for Reliability Enhancement of Phase Change Memory (提升相變化記憶體可靠度之適應性 TMR 容錯技術)
Advisor: 呂學坤 (Shyue-Kung Lu)
Oral defense committee: 李進福 (Jin-Fu Li), 洪進華 (Jin-Hua Hong), 黃樹林 (Shu-Lin Hwang), 王乃堅 (Nai-Jian Wang)
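Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Publication year: 2018
Academic year of graduation: 106 (ROC calendar)
Language: Chinese
Number of pages: 75
Keywords (translated from Chinese): phase change memory, adaptive, reliability, fault tolerance, fault masking
Keywords (English): TMR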

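Rapid advances in technology have made human life more convenient. Consumer electronics, for example, have become part of daily life and can be seen everywhere. As generations of technology evolve, data storage has always been a critical element, so the demand for storage devices keeps rising. Phase change memory is regarded as one of the candidates for mainstream next-generation memory, with advantages such as non-volatility and higher scalability. However, the key factor limiting the lifetime of phase change memory is its write endurance. Once the number of writes exceeds the physical limit of the memory cell, the phase change material separates from the heater, and current heating can no longer change the state of the material. The result is a permanent fault that ends the lifetime of the phase change memory.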

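This thesis therefore proposes an adaptive TMR fault-tolerance technique for phase change memory. Each codeword is allocated only a single parity bit, and the data in the codeword is divided into several units. When the parity bit detects the first fault in a codeword, the memory address of that codeword is recorded in a content-addressable memory, and a fault masking technique is used to keep the fault in that codeword from being activated by the data written. If fault masking cannot prevent the fault from being activated, the storage mode of the faulty codeword is switched from SLC (Single Level Cell) to TLC (Triple Level Cell), the order of the original data is rearranged according to a shift count, and the data of the codeword is written back in TLC mode. On later reads of that codeword, the order of the data is restored according to the shift count, and the TMR technique then yields the correct data for the whole codeword.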
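The detection step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis's implementation: even parity, a stuck-at fault model, and all names (`write_codeword`, `apply_stuck_at`, `write_and_verify`, `faulty_addresses`) are hypothetical, and a Python set stands in for the content-addressable memory.

```python
def parity(bits):
    """Even parity: XOR of all bits."""
    p = 0
    for b in bits:
        p ^= b
    return p

def write_codeword(data):
    """Append one parity bit to the data bits (assumed even parity)."""
    return data + [parity(data)]

def apply_stuck_at(codeword, stuck):
    """Model permanent faults: `stuck` maps a bit index to its stuck value."""
    return [stuck.get(i, b) for i, b in enumerate(codeword)]

def has_error(codeword):
    """A codeword with even parity must XOR to 0."""
    return parity(codeword) != 0

# Stand-in for the CAM: addresses of codewords with a detected fault.
faulty_addresses = set()

def write_and_verify(address, data, stuck):
    """Write, read back through the fault model, and log the address on error."""
    stored = apply_stuck_at(write_codeword(data), stuck)
    if has_error(stored):
        faulty_addresses.add(address)
    return stored

# A cell stuck at the value being written is not activated, so nothing is logged.
write_and_verify(0x10, [1, 0, 1, 1], {1: 0})
# The same cell stuck at the opposite value is activated and detected.
write_and_verify(0x20, [1, 0, 1, 1], {1: 1})
print(sorted(faulty_addresses))
```

The two calls show why masking can work at all: a stuck-at cell only corrupts data when the value written differs from the stuck value. A single parity bit detects any odd number of activated faults but cannot locate them, which is why a separate recovery step is needed.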

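Owing to the characteristics of the fault masking and TMR techniques, each protected codeword can tolerate multiple faulty bits. In addition, because each codeword carries only one parity bit, the technique does not incur a large hardware cost. Experimental results show that the proposed technique achieves a repair rate above 80% and a yield above 96% at a hardware cost of only 5.9%. A phase change memory using the proposed technique still maintains a reliability of 52% even after one million hours.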


Rapid advances in technology have made daily life more convenient. Consumer electronics, for example, have entered everyday life and are now everywhere. Data storage has always been a critical part of this progress, so the demand for storage devices has grown with each generation. Phase change memory (PCM) is regarded as one of the mainstream memory candidates for the next generation. It offers many advantages, such as non-volatility and higher scalability. However, the lifetime of a PCM is limited by its write endurance. Once the number of writes exceeds the physical limit of the memory cell, the phase change material separates from the heater. Current heating can then no longer change the state of the phase change material, so a permanent fault occurs and the lifetime of the PCM ends.

This thesis therefore proposes an adaptive triple modular redundancy technique (ATMRT) for phase change memory. Only one parity bit is configured in each codeword, and the data in the codeword is divided into several units. When the parity bit detects the first permanent fault in a codeword, the memory address of that codeword is recorded in a content-addressable memory (CAM), and a fault masking technique is used to keep the permanent fault from being activated by the data written. If fault masking cannot prevent the fault from being activated, the storage mode of the faulty codeword is converted from SLC (Single Level Cell) to TLC (Triple Level Cell). The order of the original data is then changed according to a shift count, and the data of the faulty codeword is written back in TLC mode. On later reads, the order of the data is restored according to the shift count, and the correct data of the codeword is obtained with the TMR technique.
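The TLC write-back and TMR read described above can be illustrated with a small Python sketch. The layout here is an assumption made for illustration, not the thesis's actual scheme: each three-bit cell holds one bit from each of three copies of the data, copy j is rotated by j times the shift count, and a faulty cell is assumed to corrupt all three of its bits. The function names (`tlc_write`, `tlc_read`, `find_shift`) are hypothetical.

```python
def rotate(bits, k):
    """Rotate a list left by k positions (negative k rotates right)."""
    k %= len(bits)
    return bits[k:] + bits[:k]

def tlc_write(data, shift):
    """Pack three rotated copies of `data` into len(data) three-bit cells."""
    copies = [rotate(data, j * shift) for j in range(3)]
    return [[copies[j][i] for j in range(3)] for i in range(len(data))]

def tlc_read(cells, shift):
    """Undo the rotations, then take a bitwise majority over the three copies."""
    copies = [rotate([cell[j] for cell in cells], -j * shift) for j in range(3)]
    return [1 if a + b + c >= 2 else 0 for a, b, c in zip(*copies)]

def find_shift(n, faulty_cells):
    """Return a shift for which no data bit loses more than one copy, else None."""
    for shift in range(1, n):
        hits = [0] * n
        for cell in faulty_cells:
            for j in range(3):
                # Copy j of data bit (cell + j*shift) mod n lives in this cell.
                hits[(cell + j * shift) % n] += 1
        if max(hits) <= 1:
            return shift
    return None

data = [1, 0, 1, 1, 0, 0, 1, 0]
faulty = [2, 3]                      # assumed positions of two faulty cells
shift = find_shift(len(data), faulty)
cells = tlc_write(data, shift)
for c in faulty:                     # worst case: every bit in a faulty cell flips
    cells[c] = [1 - b for b in cells[c]]
print(shift, tlc_read(cells, shift) == data)
```

With a shift of zero, all three copies of a bit would sit in the same cell and a single faulty cell would defeat the vote. Rotating the copies spreads the damage so that each data bit loses at most one copy, which is the property majority voting needs.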

Owing to the characteristics of the fault masking and TMR techniques, each protected codeword can tolerate multiple faulty bits. In addition, because only one parity bit is configured in each codeword, the hardware cost is low. Experimental results show that the proposed technique requires only 5.9% hardware overhead to achieve a repair rate of more than 80% and a yield of more than 96%. Moreover, the reliability remains at 52% even after one million hours of operation.
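To put the single parity bit in perspective, the sketch below compares its storage cost with the check bits that a single-error-correcting Hamming code needs for the same data width. The data widths are illustrative assumptions, since the abstract does not state the codeword size, and the figures cover check bits only. They are not meant to reproduce the 5.9% total overhead, which presumably also covers the CAM and control logic.

```python
def hamming_check_bits(k):
    """Smallest r with 2**r >= k + r + 1 (single-error-correcting Hamming code)."""
    r = 1
    while 2 ** r < k + r + 1:
        r += 1
    return r

def overhead(check_bits, k):
    """Check bits expressed as a fraction of the k data bits."""
    return check_bits / k

for k in (32, 64, 128):  # assumed data widths per codeword
    parity_cost = overhead(1, k)
    hamming_cost = overhead(hamming_check_bits(k), k)
    print(f"k={k}: parity {parity_cost:.1%}, Hamming SEC {hamming_cost:.1%}")
```

The comparison only shows why one parity bit per codeword is cheap in storage terms. Parity alone detects but does not correct, so the correction capability in the proposed scheme comes from fault masking and the TLC-mode TMR fallback.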

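Acknowledgments
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Background and Motivation
1.2 Organization
Chapter 2 Phase Change Memory
2.1 Basic Structure of Phase Change Memory
2.2 Operating Principles of Phase Change Memory
2.3 Storage Modes of Phase Change Memory
2.4 Reliability Challenges of Phase Change Memory
Chapter 3 Reliability Enhancement Techniques for Phase Change Memory
3.1 Error Correction Code Techniques
3.1.1 Hamming Code
3.1.2 Modified Hamming Code
3.2 Fault Avoidance Techniques
3.2.1 Wear Leveling
3.2.2 Write Reduction
3.2.3 Thermal Disturbance Reduction
3.3 Redundancy-Based Repair Techniques
3.3.1 Dynamic Data Partitioning
3.3.2 Error Correction Pointer
3.3.3 Data Address Remapping
3.3.4 Other Redundancy-Based Repair Techniques
Chapter 4 Adaptive TMR Fault-Tolerance Technique for Phase Change Memory
4.1 Fault Masking Technique
4.2 TMR (Triple Modular Redundancy) Technique
4.3 Adaptive TMR Fault-Tolerance Technique
4.3.1 Write Operation Flow
4.3.2 Read Operation Flow
4.3.3 Write Operation Flow Example
4.3.4 Read Operation Flow Example
4.4 Fault Dispersal Algorithm for Faulty Codewords
4.5 Hardware Architecture Design
Chapter 5 Experimental Results and Data Analysis
5.1 Repair Rate Analysis
5.1.1 Defect Distribution and Fault Model Settings
5.1.2 Simulator for Evaluating Repair Rate
5.1.3 Repair Rate Simulation Results
5.2 Yield Analysis
5.3 Reliability Analysis
5.3.1 Reliability Evaluation Model
5.3.2 Reliability Simulation Results
5.4 Hardware Cost Analysis
5.4.1 Hardware Cost Estimation Model
5.4.2 Hardware Cost Analysis Results
5.5 VLSI Implementation
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References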

