
Graduate student: 蕭子翔 (Tzu-Hsiang Hsiao)
Thesis title: 使用位址重映射及錯誤抵抗訓練技術以提升基於電阻式記憶體之記憶體內運算系統的良率與可靠度
Address Remapping and Fault-Resilient Training Techniques for Enhancing Yield and Reliability of RRAM-Based In-Memory Computing Systems
Advisor: 呂學坤 (Shyue-Kung Lu)
Oral defense committee: 呂學坤 (Shyue-Kung Lu), 許鈞瓏 (Chun-Lung Hsu), 李進福 (Jin-Fu Li), 王乃堅 (Nai-Jian Wang), 黃樹林 (Shu-Lin Hwang)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2020
Academic year of graduation: 108 (ROC calendar)
Language: Chinese
Pages: 92
Keywords (Chinese): 深度學習, 記憶體內運算, 電阻式記憶體, 容錯技術
Keywords (English): Deep Learning, In Memory Computing, Resistive RAM, Fault Tolerance Techniques

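In recent years, deep neural networks (DNNs) have been widely applied in many fields, such as autonomous driving, image recognition, and biomedical electronic devices, where they reach a useful level of accuracy. On a conventional von Neumann architecture, however, computing the matrix and vector multiplications of a neural network spends much of its time and power moving data between the memory units and the computing units, so real-time recognition cannot be achieved. Resistive random access memory (RRAM) addresses this problem efficiently, with better area and power efficiency than the von Neumann architecture.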
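However, because the fabrication process is not yet mature, many faults occur in the RRAM cell array. They cause errors in the matrix computations and in turn degrade accuracy. Many fault-tolerance techniques have been proposed to improve accuracy when computing with RRAM [1, 2, 3]. This thesis proposes an address remapping (AR) technique for the weight matrices and a fault-resilient training (FRT) technique. AR tries to avoid faults at the same logical address in the positive and negative memory matrices, which raises the probability that the other matrix can be used to repair a weight value. FRT trains the weights so that their distribution is concentrated around the origin, which reduces the effect of a fault when one occurs. Together, the two techniques eliminate the router that the techniques in [1, 2] need in order to map a physical memory address to an arbitrary logical address, and they do not require retraining the neural network according to the fault distribution.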
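The intuition behind concentrating weights around the origin can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record does not give the FRT loss function, so plain L2 weight decay is used here as a stand-in, the task gradients are set to zero so that only the decay term acts, and the fault model is a cell whose stored weight reads back as zero. The names `decay_step` and `mean_stuck_at_zero_error` are hypothetical.

```python
def decay_step(weights, grads, lr, weight_decay):
    """One SGD step on loss + (weight_decay / 2) * sum(w ** 2)."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


def mean_stuck_at_zero_error(weights):
    """Average |w - 0|: the error made when a cell storing w reads as 0 (assumed fault model)."""
    return sum(abs(w) for w in weights) / len(weights)


weights = [0.8, -0.6, 0.4, -0.2]
before = mean_stuck_at_zero_error(weights)

shrunk = weights
for _ in range(10):
    # Task gradients are zero here so that only the decay term is visible (illustration only).
    shrunk = decay_step(shrunk, [0.0] * len(shrunk), lr=0.1, weight_decay=0.5)
after = mean_stuck_at_zero_error(shrunk)

print(before, after)  # the average fault-induced error shrinks with the weights
```

In real training the task gradient pulls in the other direction, so the decay strength trades fault tolerance against fault-free accuracy.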
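This work implements the address remapper circuit and uses the PyTorch [4] deep learning framework to simulate the fault-resilient training technique and evaluate accuracy. An acceptance rate (AR) is defined for each acceptable accuracy drop (AAD) and replaces the conventional repair rate (RR); yield is then computed from the acceptance rate. According to the experimental results, with a fault rate of 10% and an AAD of 5%, running LeNet5 [5] inference on the MNIST [6] handwritten digit dataset gives an accuracy of 87.66% and a yield of 85.9% after repair with the error correction technique [3]. With address remapping, accuracy reaches 94.47% and yield 90.7%. Combining it with fault-resilient training keeps accuracy at 92.85%, with a yield of 95.5%. The total additional hardware cost does not exceed 1%.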


Deep neural networks (DNNs) have been widely used in autonomous driving, image recognition, and biomedical electronic equipment. However, on the traditional von Neumann computer architecture, computing the massive matrix-vector multiplications usually costs a lot of time and power to move data between the memory units and the computing units, which makes real-time recognition infeasible. To overcome this drawback, resistive random access memory (RRAM) is widely used for in-memory computing. Its area and power overheads are lower than those of the conventional von Neumann architecture.
However, because the RRAM fabrication process is immature, many manufacturing defects may exist in the RRAM cell array and result in erroneous matrix-vector multiplications. These errors in turn reduce the recognition accuracy of DNN applications. Many fault-tolerant techniques have been proposed to improve accuracy in RRAM-based neuromorphic computing [1, 2, 3].
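How a single defective cell corrupts a matrix-vector product can be illustrated with a small Python sketch. It assumes the common differential mapping in which each weight is stored as the difference between a cell in a positive array and a cell in a negative array, and it models a defect as a cell stuck at a fixed value. The helper names and the numbers are illustrative, not taken from the thesis.

```python
def split_weights(w):
    """Store each signed weight as (positive cell, negative cell) with w = pos - neg."""
    pos = [[max(v, 0.0) for v in row] for row in w]
    neg = [[max(-v, 0.0) for v in row] for row in w]
    return pos, neg


def crossbar_mvm(pos, neg, x):
    """Column outputs y[j] = sum_i x[i] * (pos[i][j] - neg[i][j])."""
    cols = len(pos[0])
    return [sum(x[i] * (pos[i][j] - neg[i][j]) for i in range(len(x)))
            for j in range(cols)]


def inject_stuck_at(matrix, faults):
    """Return a copy of matrix where each (row, col) in faults is forced to its stuck value."""
    out = [row[:] for row in matrix]
    for (r, c), value in faults.items():
        out[r][c] = value
    return out


w = [[0.5, -0.25],
     [-1.0, 0.75]]
x = [1.0, 2.0]

pos, neg = split_weights(w)
ideal = crossbar_mvm(pos, neg, x)

# One cell of the positive array is stuck at 0 (assumed fault).
faulty_pos = inject_stuck_at(pos, {(0, 0): 0.0})
faulty = crossbar_mvm(faulty_pos, neg, x)

print(ideal, faulty)
```

Only the column that contains the faulty cell changes, and the size of the error equals the input times the lost weight value.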
This thesis proposes the address remapping (AR) and fault-resilient training (FRT) techniques for mitigating the effects of defective RRAM cells. The AR technique tries to avoid defective cells occurring at the same logical address in the positive and negative memory matrices, which greatly improves the probability that the other matrix can be used to compensate for the weight value of a faulty cell. The FRT technique trains the weights so that their distribution is concentrated around the origin, which reduces the effect of faults. Neither technique requires the complex router used in [1, 2] to map a physical address to an arbitrary logical address, and the neural network does not have to be retrained according to the fault distribution.
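A minimal Python sketch of the remapping idea follows. The abstract does not say how the remapper transforms addresses, so the sketch makes an assumption: the negative array's physical address is the logical address XORed with a control word, and an exhaustive search over control words picks the one with the fewest logical addresses that are faulty in both arrays. Fault locations are simplified to row addresses, and all names are hypothetical.

```python
def overlapping_faults(pos_faults, neg_faults, control_word):
    """Count logical addresses that are faulty in both arrays.

    Assumed mapping: positive array uses physical = logical,
    negative array uses physical = logical XOR control_word.
    """
    neg_logical = {phys ^ control_word for phys in neg_faults}
    return len(pos_faults & neg_logical)


def exhaustive_search(pos_faults, neg_faults, addr_bits):
    """Try every control word and keep the first one with the fewest overlaps."""
    best = min(range(1 << addr_bits),
               key=lambda cw: overlapping_faults(pos_faults, neg_faults, cw))
    return best, overlapping_faults(pos_faults, neg_faults, best)


pos_faults = {1, 4, 6}   # faulty physical addresses in the positive array
neg_faults = {1, 6}      # faulty physical addresses in the negative array

unmapped = overlapping_faults(pos_faults, neg_faults, 0)
control_word, overlaps = exhaustive_search(pos_faults, neg_faults, addr_bits=3)

print(unmapped, control_word, overlaps)
```

Without remapping, addresses 1 and 6 are faulty in both arrays. With control word 1 the negative-array faults move to logical addresses 0 and 7, so every faulty weight has a healthy cell in the other array.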
The architecture of the address remapper for the AR technique is also proposed. We use the deep learning framework PyTorch [4] to evaluate accuracy. Moreover, the acceptance rate (AR) is defined based on the specified acceptable accuracy drop (AAD). In place of the conventional notions of repair rate and yield, the AAD-based acceptance rate is used as the measure for estimating the effective repair rate and yield.
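One way to read the acceptance rate is as the fraction of fault-injected chips whose accuracy stays within the AAD of the fault-free accuracy. The Python sketch below assumes exactly that reading. The accuracy values are invented for illustration, and the step from acceptance rate to yield is not modeled because this record does not give the formula.

```python
def acceptance_rate(fault_free_acc, chip_accs, aad):
    """Fraction of chips whose accuracy drop is at most aad (all values in percent)."""
    accepted = sum(1 for acc in chip_accs if fault_free_acc - acc <= aad)
    return accepted / len(chip_accs)


# Hypothetical accuracies of four simulated faulty chips (percent).
chip_accs = [97.5, 95.0, 93.9, 88.0]
rate = acceptance_rate(fault_free_acc=99.0, chip_accs=chip_accs, aad=5.0)

print(rate)
```

Here two of the four chips lose no more than 5 percentage points, so half of them count as accepted even though none is fault-free.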
With an injected fault percentage of 10% and an AAD of 5%, we use LeNet5 [5] to run inference on the MNIST [6] handwritten digit data set. With the error correction technique [3], the accuracy and yield after repair are 87.66% and 85.9%, respectively. With the proposed AR technique, the accuracy and yield improve to 94.47% and 90.07%, respectively. When the FRT and AR techniques are used together, the accuracy and yield are further enhanced to 95.83% and 95.49%, respectively. The hardware overhead for implementing both techniques is less than 0.4%.

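Acknowledgments
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Background and Motivation
  1.2 Organization
Chapter 2 Introduction to Deep Learning
  2.1 Neurons and Deep Neural Network Architecture
  2.2 Fully Connected Neural Networks
  2.3 Convolutional Neural Networks
  2.4 Training Neural Networks
    2.4.1 Loss Function
    2.4.2 Gradient Descent
    2.4.3 Choosing the Learning Rate
Chapter 3 Basic Operating Principles and Applications of RRAM
  3.1 Basic Architecture of RRAM
  3.2 Access Principles of RRAM
    3.2.1 Write Operation
    3.2.2 Read Operation
  3.3 Applications of RRAM
    3.3.1 High-Capacity Data Storage Systems
    3.3.2 Random Access Memory
    3.3.3 In-Memory Computing
Chapter 4 Testing and Repair Techniques for RRAM
  4.1 Functional Fault Models
    4.1.1 General Fault Models for Common Memories
    4.1.2 RRAM-Specific Fault Models
  4.2 Previous Fault-Tolerance Techniques
    4.2.1 Row Address Rearrangement Technique
    4.2.2 Error Correction Technique
    4.2.3 Retraining Technique
Chapter 5 Address Remapping Technique
  5.1 Basic Concept of Address Remapping
  5.2 Address Remapping Technique
  5.3 Control Word Generation Flow
  5.4 Address Remapping Examples
    5.4.1 Example Using Exhaustive Search
    5.4.2 Example Using Bit-Complement Search
Chapter 6 Fault-Resilient Training Technique
  6.1 Weight Distribution and the Effect of Stuck-At Faults
  6.2 Weight Decay
  6.3 Fault-Resilient Training Technique
Chapter 7 Experimental Results
  7.1 Deep Learning Framework and Model Settings
  7.2 Fault Distribution and Fault Model Settings
  7.3 Accuracy Analysis
  7.4 Acceptance Rate Analysis
  7.5 Yield Analysis
  7.6 VLSI Implementation
Chapter 8 Conclusions and Future Work
  8.1 Conclusions
  8.2 Future Work
References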

[1] L. Chen, J. Li, Y. Chen, Q. Deng, J. Shen, X. Liang, and L. Jiang, “Accelerator-friendly neural-network training: learning variations and defects in RRAM crossbar,” in Proc. of the Conference on Design, Automation & Test in Europe (DATE), pp. 19–24, Mar. 2017.
[2] B. Liu, H. Li, Y. Chen, X. Li, Q. Wu, and T. Huang, “Vortex: variation-aware training for memristor x-bar,” in Proc. IEEE Design Automation Conference (DAC), pp. 1–6, June 2015.
[3] J. Y. Hu, K. W. Hou, C. Y. Lo, Y. F. Chou, and C. W. Wu, “RRAM-Based Neuromorphic Hardware Reliability Improvement by Self-Healing and Error Correction,” in Proc. IEEE International Test Conference in Asia (ITC-Asia), Aug. 2018.
[4] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer, “Automatic differentiation in pytorch,” in Proc. Conf. Neural Info. Process. Syst. Workshop, pp. 1–4, Oct. 2017.
[5] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[6] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, May 2015.
[7] J. Dean, “Jeff Dean at AI frontiers: trends and developments in deep learning research,” Jan. 2017. [Online]. Available: https://www.slideshare.net/AIFrontiers/jeff-dean-trends-and-developments-in-deep-learning-research?from_action=save
[8] R. Bez, E. Camerlenghi, A. Modelli, and A. Visconti, “Introduction to flash memory,” Proc. IEEE, vol. 91, no. 4, pp. 489–502, Apr. 2003.
[9] Y. Li and K. N. Quader, “NAND flash memory: challenges and opportunities,” Computer, vol. 46, pp. 23–29, Aug. 2013.
[10] A. S. Spinelli, C. M. Compagnoni, and A. L. Lacaita, “Reliability of NAND flash memories: planar cells and emerging issues in 3D devices,” Computers, vol. 6, no. 2, pp. 16, Apr. 2017, [Online]. Available: https://www.mdpi.com/2073-431X/6/2/16.
[11] B. Schroeder, R. Lagisetty, and A. Merchant, “Flash reliability in production: the expected and the unexpected,” in Proc. of the 14th USENIX Conference on File and Storage Technologies (FAST’16), pp. 67–80, Feb. 2016.
[12] S. Yu, “Resistive random access memory (RRAM),” in Synth. Lect. Emerg. Eng. Technol., vol. 2, no. 5, pp. 1–79, Mar. 2016.
[13] J. J. Yang, D. B. Strukov, and D. R. Stewart, “Memristive devices for computing,” Nature Nanotechnology, vol. 8, no. 1, pp. 13–24, Dec. 2012.
[14] M. N. Baibich et al., “Giant magnetoresistance of (001)Fe/(001)Cr magnetic superlattices,” Phys. Rev. Lett., vol. 61, no. 21, pp. 2472–2475, Nov. 1988.
[15] N. Papandreou, H. Pozidis, A. Pantazi, A. Sebastian, M. Breitwisch, C. Lam, and E. Eleftheriou, “Programming algorithms for multilevel phase-change memory,” in Proc. IEEE Int’l Symp. on Circuits and Systems, pp. 329–332, May 2011.
[16] J. R. Anderson, “Ferroelectric storage elements for digital computers and switching systems,” IEEE Electrical Engineering, vol. 71, no. 10, pp. 916–922, Oct. 1952.
[17] C. Y. Chen, H. C. Shih, C. W. Wu, C. H. Lin, P. F. Chiu, S. S. Sheu, and F. T. Chen, “RRAM defect modeling and failure analysis based on march test and a novel squeeze-search scheme,” IEEE Trans. Comput., vol. 64, no. 1, pp. 180–190, Jan. 2015.
[18] S. Weidman, “Deep learning with pytorch: a 60 minute blitz training a classifier,” Oct. 2019. [Online]. Available: https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html.
[19] X. Glorot, A. Bordes, and Y. Bengio, “Deep sparse rectifier neural networks,” in Proc. Conf. Artificial Intelligence and Statistics (AISTATS), vol. 15, pp. 315–323, Apr. 2011.
[20] F. F. Li, R. Krishna, D. Xu, A. Byun, W. Shen, J. Braatz, D. Cai, J. Gwak, De-An Huang, A. Kondrich, F. Yu Lin, D. Mrowca, B. Pan, N. Rai, L. P. Tchapmi, C. Waites, R. Wang, Yi Wen, K. Yang, B. Yi, C. Yuan, K. Zakka, and Y. Zhang, “CS231n convolutional neural networks for visual recognition,” Jan. 2015. [Online]. Available: https://cs231n.github.io/convolutional-networks/
[21] L. Bottou, “Large-scale machine learning with stochastic gradient descent,” in Proc. of COMPSTAT, pp. 177–186, Sep. 2010.
[22] T. W. Hickmott, “Low-frequency negative resistance in thin anodic oxide films,” Journal of Applied Physics, vol. 33, no. 9, pp. 2669–2682, Sep. 1962.
[23] J. F. Gibbons and W. E. Beadle, “Switching properties of thin NiO films,” Solid-State Electron., vol. 7, no. 11, pp. 785–790, Nov. 1964.
[24] G. Dearnaley, A. M. Stoneham, and D. V. Morgan, “Electrical phenomena in amorphous oxide films,” Rep. Progr. Phys., vol. 33, pp. 1129–1191, Sep. 1970.
[25] Y. Watanabe, J. G. Bednorz, A. Bietsch, Ch. Gerber, D. Widmer, A. Beck, and S. J. Wind, “Current-driven insulator-conductor transition and nonvolatile memory in chromium-doped SrTiO3 single crystals,” Appl. Phys. Lett., vol. 78, no. 23, pp. 3738–3740, June 2001.
[26] A. Beck, J. G. Bednorz, C. Gerber, C. Rossel, and D. Widmer, “Reproducible switching effect in thin oxide films for memory applications,” Appl. Phys. Lett., vol. 77, no. 1, pp. 139–141, July 2000.
[27] S. Seo, M. J. Lee, D. H. Seo, E. J. Jeoung, D. S. Suh, Y. S. Joung, and I. K. Yoo, “Reproducible resistance switching in polycrystalline NiO films,” Appl. Phys. Lett., vol. 85, no. 23, pp. 5655–5657, Dec. 2004.
[28] C. Rohde, B. J. Choi, D. S. Jeong, S. Choi, J. S. Zhao, and C. S. Hwang, “Identification of a determining parameter for resistive switching of TiO2 thin films,” Appl. Phys. Lett., vol. 86, pp. 262907-1–262907-3, Jun. 2005.
[29] L. O. Chua, “Memristor – the missing circuit element,” IEEE Trans. on Circuit Theory, vol. 18, no. 5, pp. 507–519, Sep. 1971.
[30] H. S. P. Wong, H. Y. Lee, S. Yu, Y. S. Chen, Y. Wu, P. S. Chen, B. Lee, F. T. Chen, and M. J. Tsai, “Metal-Oxide RRAM,” in Proc. of the IEEE, vol. 100, no. 6, pp. 1951–1970, June 2012.
[31] P. Y. Chen and S. Yu, “Impact of vertical RRAM device characteristics on 3D cross-point array design,” in Proc. IEEE 6th Int’l Memory Workshop (IMW), pp. 127–130, May 2014.
[32] W. A. Wulf and S. A. McKee, “Hitting the memory wall: implications of the obvious,” ACM SIGARCH Computer Architecture News, vol. 23, no. 1, pp. 20–24, Mar. 1995.
[33] D. Ielmini and H.-S. P. Wong, “In-memory computing with resistive switching devices,” Nature Electronics, vol. 1, no. 6, pp. 333–343, June 2018.
[34] F. Alibart, L. Gao, B. D. Hoskins, and D. B. Strukov, “High precision tuning of state for memristive devices by adaptable variation-tolerant algorithm,” Nanotechnology, vol. 23, no. 7, p. 075201, Jan. 2012.
[35] M. Hu, J. P. Strachan, Z. Li, E. M. Grafals, N. Davila, C. Graves, S. Lam, N. Ge, J. Yang, and R. S. Williams, “Dot-product engine for neuromorphic computing: programming 1T1M crossbar to accelerate matrix-vector multiplication,” in Proc. IEEE Design Automation Conference (DAC), June 2016.
[36] W. Huangfu, L. Xia, M. Cheng, X. Yin, T. Tang, B. Li, K. Chakrabarty, Y. Xie, Yu Wang, and H. Yang, “Computation-oriented fault-tolerance schemes for RRAM computing systems,” in Design Automation Conference (ASP-DAC), 22nd Asia and South Pacific, pp. 794–799, Jan. 2017.
[37] L. Xia, W. Huangfu, T. Tang, X. Yin, K. Chakrabarty, Y. Xie, Yu Wang, and H. Yang, “Stuck-at Fault Tolerance in RRAM Computing Systems,” IEEE Journal on Emerging and Selected Topics in Circuits and Systems, pp. 102–115, Nov. 2017.
[38] C. Liu, M. Hu, J. P. Strachan, and H. Li, “Rescuing Memristor-based Neuromorphic Design with High Defects,” in Proc. IEEE Design Automation Conference (DAC), pp. 18-22, Oct. 2017.
[39] T. Karnik, P. Hazucha, and J. Patel, “Characterization of soft errors caused by single event upsets in CMOS processes,” IEEE Trans. Depend. Secure Computing, vol. 1, no. 2, pp. 128–143, Apr.–June 2004.
[40] W. Kuo, W. T. K. Chien, and T. Kim, “Reliability, yield, and stress burn-in,” Kluwer Academic Publishers, Jan. 1998.
[41] K. Sachhidh, J. Rajendran, R. Karri, and O. Sinanoglu, “Sneak path testing of metal-oxide memristor-based memories,” in Proc. 26th Int’l Conf. VLSI Design, pp. 386–391, Jan. 2013.
[42] R. Dekker, F. Beenker, and L. Thijssen, “Fault modeling and test algorithm development for static random-access memories,” in Proc. IEEE Int’l Test Conf., pp. 343–352, Sep. 1988.
[43] H. W. Kuhn, “The Hungarian method for the assignment problem,” Naval Research Logistics Quarterly, vol. 2, pp. 83–97, 1955.
[44] Z. He, J. Lin, R. Ewetz, J.-S. Yuan, and D. Fan, “Noise injection adaption: end-to-end ReRAM crossbar non-ideal effect adaption for neural network mapping,” in Proc. IEEE Design Automation Conference (DAC), pp. 2–6, June 2019.
[45] A. Krogh and J. A. Hertz, “A simple weight decay can improve generalization,” Advances Neural Inform. Processing Syst., vol. 4, pp. 950–957, Dec. 1991.
[46] J. Despois, “Memorizing is not learning! -6 tricks to prevent overfitting in machine learning,” Mar. 2018. [Online]. Available: https://hackernoon.com/memorizing-is-not-learning-6-tricks-to-prevent-overfitting-in-machine-learning-820b091dc42.

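Full-text release date: 2025/08/06 (campus network)
Full-text release: not authorized for public release (off-campus network)
Full-text release: not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)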