
Graduate student: 蔡文齊
Wen-Chi Tsai
Thesis title: 改善快閃記憶體可靠度之適應性容錯技術
Adaptive Fault-Tolerant Techniques for Improving Reliability of Flash Memory
Advisor: 呂學坤
Shyue-Kung Lu
Oral defense committee: 呂學坤
Shyue-Kung Lu
李進福
Jin-Fu Li
黃樹林
Shu-Lin Hwang
王乃堅
Nai-Jian Wang
洪進華
Jin-Hua Hong
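Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar)
Language: Chinese
Pages: 80
Keywords (Chinese): 快閃記憶體, 容錯, 錯誤更正碼, 良率, 可靠度
Keywords (English): Flash memory, Fault tolerance, Error correction codes, Yield, Reliability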
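  • Owing to their high performance, low power consumption, and good shock resistance, flash memories are widely used in consumer electronics such as notebook computers, mobile phones, and solid-state drives. Flash memory stores data as the amount of charge held on the floating gate. As process technology advances, the number of bits that a single cell can store keeps growing, which shrinks the noise margin and causes reliability and endurance to degrade severely as the program/erase (P/E) cycle count rises. The most common remedy is an error correction code (ECC), which effectively handles the main fault types of flash memory: permanent faults and disturb faults. However, flash memory does not support in-place update, so the faults within a codeword accumulate over time; once their number exceeds the protection capability of the code, the codeword can no longer be repaired.
    This thesis therefore proposes an adaptive fault-tolerant technique to address these problems. With the ECC protection capability held fixed, we introduce the concept of correction slack: by computing the remaining correction capability of the ECC, extra protection is given only to codewords whose remaining correction capability is low, by recording their error locations. A page containing a codeword whose accumulated errors have exceeded the ECC protection capability is designated a critical page and undergoes periodic virtual scrubbing, so that faults do not accumulate beyond the protection capability and become unrepairable.
    This work implements the hardware circuit of the adaptive fault-tolerant technique and analyzes the repair rate, yield, reliability, and hardware cost of the method on a 128 MB flash memory. According to the experimental results, with 48 CAM entries and a codeword length of 256 bits, the repair rate improves by up to 32% over using ECC alone, the yield stays above 93% even in the worse cases, and the reliability remains at 91% after 106 hours, while the hardware cost is below 0.12% of the whole flash memory.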


    Owing to their high performance, low power consumption, and shock resistance, flash memories are widely used in consumer products such as notebooks, smartphones, and SSDs. The data stored in a flash memory cell is determined by the amount of charge programmed on its floating gate. As the number of bits that can be stored in a single cell keeps increasing, the noise margin at each Vth level of a multi-level cell shrinks significantly, which leads to reliability and endurance problems as the program/erase (P/E) count increases. The most widely used remedy is an error correction code (ECC). ECC can effectively handle the main fault types of flash memories, namely permanent faults and disturb faults. However, flash memories do not support in-place update, so the faulty bits within a codeword accumulate over time. When the number of faulty bits in a codeword exceeds the ECC correction capability, the codeword can no longer be corrected.
    This thesis therefore proposes adaptive fault-tolerant techniques to solve this problem. The main idea is to introduce the concept of correction slack, defined as the remaining ECC correction capability of a codeword. The techniques give extra protection only to codewords with little correction slack, by recording the error locations determined by the ECC decoding circuitry. If a page contains a codeword whose correction slack is less than or equal to 0, the page is marked as a critical page. To prevent faulty bits from accumulating until repair fails, we also propose a virtual scrubbing technique that scrubs the faulty bits of critical pages periodically.
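The correction-slack idea can be illustrated with a small software model. The following is only a minimal sketch under stated assumptions, not the thesis's hardware design: the names `ErrorLocationCAM`, `correction_slack`, `protect_page`, and the `slack_threshold` parameter are hypothetical, the ECC is assumed to correct up to `t` bits per codeword, and the decoder is assumed to report faulty bit positions. Only the 48-entry CAM size and the "slack less than or equal to 0 marks a critical page" rule come from the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorLocationCAM:
    """Hypothetical fixed-capacity table of (page, codeword, bit) error locations."""
    capacity: int = 48  # the abstract's experiments use 48 CAM entries
    entries: set = field(default_factory=set)

    def record(self, page, codeword, bit):
        """Store one error location; return False when the table is full."""
        key = (page, codeword, bit)
        if key in self.entries:
            return True
        if len(self.entries) >= self.capacity:
            return False
        self.entries.add(key)
        return True


def correction_slack(t, error_bits):
    """Remaining ECC correction capability: t minus the errors already seen."""
    return t - len(error_bits)


def protect_page(page_id, codeword_errors, t, cam, slack_threshold=1):
    """Record error locations of low-slack codewords; report if the page is critical.

    codeword_errors[i] is the list of faulty bit positions that the ECC
    decoder reported for codeword i of this page (an assumed interface).
    """
    critical = False
    for cw_index, error_bits in enumerate(codeword_errors):
        slack = correction_slack(t, error_bits)
        if slack <= slack_threshold:  # assumed policy for "little slack"
            for bit in error_bits:
                cam.record(page_id, cw_index, bit)
        if slack <= 0:  # rule stated in the abstract
            critical = True
    return critical


cam = ErrorLocationCAM()
# t = 2: codeword 0 has no errors, codeword 1 has one, codeword 2 has two.
is_critical = protect_page(7, [[], [5], [17, 200]], t=2, cam=cam)
print(is_critical, sorted(cam.entries))
# prints: True [(7, 1, 5), (7, 2, 17), (7, 2, 200)]
```

In this model, a page flagged as critical would be a candidate for the periodic virtual scrubbing that the abstract describes; the scrubbing procedure itself is not modeled here because the abstract does not specify it.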
    The VLSI design of the proposed techniques and architectures is implemented, and the repair rate, yield, reliability, and hardware overhead are analyzed for a 128 MB flash memory. According to the experimental results with 48 CAM entries and 256 bits per codeword, the repair rate improves by up to 32% over using conventional ECC alone. The yield stays above 93% even in the worse cases. Furthermore, the reliability of the flash memory remains higher than 91% after 106 hours of normal operation. The hardware overhead of the proposed technique is below 0.12% of the whole flash memory.

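    Acknowledgments
    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1.1 Background and Motivation
        1.2 Organization
    Chapter 2 Basic Operating Principles and Applications of Flash Memory
        2.1 Basic Concepts of Flash Memory
            2.1.1 Basic Operations
            2.1.2 NAND Flash Memory
            2.1.3 NOR Flash Memory
        2.2 Solid-State Drives
            2.2.1 Logical/Physical Address Mapping
            2.2.2 Bad Block Management
            2.2.3 Garbage Collection
            2.2.4 Wear Leveling
    Chapter 3 Testing and Repair Techniques for Flash Memory
        3.1 Functional Fault Models
            3.1.1 General Fault Models for Common Memories
            3.1.2 Flash-Specific Fault Models
        3.2 Flash Memory Testing
            3.2.1 Test Flow
            3.2.2 Test Algorithms
            3.2.3 Built-In Self-Test
        3.3 Error Correction Codes
            3.3.1 Hamming Codes
            3.3.2 BCH Codes
    Chapter 4 Adaptive Fault-Tolerant Techniques Based on Redundant Bits
        4.1 Basic Concept of the Adaptive Fault-Tolerant Technique
        4.2 Adaptive Redundant-Bit Correction Technique
            4.2.1 Flow of the Adaptive Redundant-Bit Correction Technique
            4.2.2 Example of Adaptive Redundant-Bit Correction
        4.3 Virtual Scrubbing Technique
            4.3.1 Virtual Scrubbing Flow
            4.3.2 Virtual Scrubbing Example
        4.4 Hardware Architecture of the Adaptive Fault-Tolerant Technique
            4.4.1 Error Location CAM
            4.4.2 Bit Error Corrector
            4.4.3 Scrubbing Module
    Chapter 5 Experimental Results
        5.1 Defect Distribution and Defect Model Settings
            5.1.1 Defect Distribution
            5.1.2 Defect Model
        5.2 Repair Rate Analysis
        5.3 Yield Analysis
        5.4 Reliability Analysis
            5.4.1 Reliability Model
            5.4.2 Reliability Simulation Results
        5.5 Hardware Cost Analysis
        5.6 Circuit Implementation
    Chapter 6 Conclusions and Future Work
        6.1 Conclusions
        6.2 Future Work
    References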

