
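Author: Tzu-Hao Li (黎子豪)
Thesis title: Analog Integrated Circuit Design for Gated Recurrent Unit Neural Network (閘控遞迴神經網路之類比積體電路設計)
Advisor: Sheng-Yu Peng (彭盛裕)
Oral defense committee: Shun-Feng Su (蘇順豐), Yu Tsao (曹昱)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: English
Number of pages: 87
Keywords (Chinese): 閘控遞迴神經網路 (gated recurrent neural network)
Keywords (English): Recurrent Neuron Network, Long Short-Term Memory, Log Domain Low Pass Filter, Exponentially Weighted Moving Average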


Contents

Abstract in Chinese
Abstract in English
Contents
List of Figures
List of Tables
1 Introduction
  1.1 Motivation
  1.2 Thesis Organization
2 Background Knowledge
  2.1 Recurrent Neuron Network
  2.2 Long Short Term Memory (LSTM)
  2.3 Gated Recurrent Unit (GRU)
3 Gated Recurrent Unit Analog Circuit Design
  3.1 Gated Recurrent Unit in PyTorch
    3.1.1 Model Define
    3.1.2 MIT-BIH Database
    3.1.3 Network Define
    3.1.4 Classification Result
  3.2 GRU Circuit Architecture
    3.2.1 Computing In SRAM
      3.2.1.1 Four Quadrant Multiplication Implementation
      3.2.1.2 SRAM Read/Write
      3.2.1.3 Multiply Accumulate Operation
    3.2.2 Activation Function
      3.2.2.1 Sigmoid Function
      3.2.2.2 tanh Function
    3.2.3 Update Gate and Reset Gate
    3.2.4 h State
      3.2.4.1 Change of the Activation Function
      3.2.4.2 An Extra Activation Function
    3.2.5 Hidden State
      3.2.5.1 Log Domain Low Pass Filter
      3.2.5.2 Exponentially Weighted Moving Average
      3.2.5.3 Hidden State Equation Rewrite
    3.2.6 Hardware Friendly Gated Recurrent Unit
    3.2.7 Hardware Friendly GRU Verification
      3.2.7.1 PyTorch Model Rewriting
      3.2.7.2 Hardware Friendly GRU Classification Result
4 Measurement Results and Comparison
  4.1 Die Photo
  4.2 Activation Function Measurement Results
  4.3 Low Pass Filter Measurement Results
  4.4 Comparison Table
5 Conclusion
References
Letter of Authority

List of Figures

1.1 Energy Efficiency Wall [1]
1.2 Concept of Von Neumann Architecture
1.3 Cloud Computing
2.1 Jordan Network from [6]
2.2 Elman Network from [7]
2.3 Discrete Time Recurrent Unit
2.4 Long Short Term Memory (LSTM)
2.5 GRU Algorithm
3.1 Define Initial State
3.2 Define Forward State
3.3 PyTorch GRU Model
3.4 MIT-BIH Database ECG
3.5 MIT-BIH Database ECG Annotation
3.6 MIT-BIH Database Number 208 ECG
3.7 MIT-BIH Database Number 213 ECG
3.8 DataLoader
3.9 Set Model
3.10 Early Stopping Mechanism and Validation
3.11 Matrix Calculation in GRU
3.12 GRU Training Loss
3.13 GRU Classification Result Receiver Operating Characteristic
3.14 GRU Classification Confusion Matrix
3.15 Block Diagram of the Implemented GRU Chip
3.16 Schematic of the W-2W Current DAC
3.17 Current DAC With SRAM Concept
3.18 Computing In SRAM Unit Cell
3.19 Computing In SRAM Current DAC
3.20 Computing In SRAM Current DAC Block Diagram
3.21 Example 1
3.22 Example 2
3.23 Example 3
3.24 Example 4
3.25 SRAM Read/Write
3.26 Multiply Accumulate Operation Implementation
3.27 Activation Function Circuit
3.28 Common Mode Cancelation (CMC) Circuit
3.29 Activation Function with CMC Circuit
3.30 Activation Function with CMC Circuit Simulation Result
3.31 Icm Sweep
3.32 tanh Circuit Block Diagram
3.33 Current Subtractor and Absolutor
3.34 Simulation Result of the tanh Circuit
3.35 Update Gate and Reset Gate Block Diagram
3.36 h̃ Circuit Block Diagram
3.37 Block Diagram of the tanh Activation and Its Connection
3.38 Activation Biasing Activation Function
3.39 Log Domain Low Pass Filter
3.40 Time Constant Changing
3.41 RC Low Pass Filter
3.42 Hardware Friendly GRU Algorithm
3.43 The Implemented GRU Unit Cell Block Diagram
3.44 The Implemented GRU Array Block Diagram
3.45 Initial State of the Hardware Friendly GRU
3.46 Forward State of the Hardware Friendly GRU
3.47 Hardware Friendly GRU Training Loss
3.48 Hardware Friendly GRU Classification Receiver Operating Characteristic
3.49 Hardware Friendly GRU Classification Confusion Matrix
4.1 Hardware Friendly GRU Die Photo and Layout
4.2 Activation Function Measurement and Simulation Result
4.3 Activation Function Fitting Model
4.4 Activation Function Fitting Training Setup
4.5 Activation Function Fitting Result Parameter
4.6 Activation Function Fitting Result Curve
4.7 Icmb Curve
4.8 Current DAC State Response (20 kHz)
4.9 Low Pass Filter Input Current
4.10 Low Pass Filter Output Current
4.11 Low Pass Filter τ Current

List of Tables

3.1 GRU Classification Result of Data 208
3.2 GRU Classification Result of Data 213
3.3 Hardware Friendly GRU Classification Result of Data 208
3.4 Hardware Friendly GRU Classification Result of Data 213
4.1 Comparison Table

References

[1] J. Hasler and B. Marr, “Finding a roadmap to achieve large neuromorphic hardware systems,” Frontiers in Neuroscience, 2013.
[2] J. Amoh and K. Odame, “Deep neural networks for identifying cough sounds,” IEEE Transactions on Biomedical Circuits and Systems, vol. 10, no. 5, pp. 1003–1011, 2016.
[3] A. Jaiswal, I. Chakraborty, A. Agrawal, and K. Roy, “8T SRAM cell as a multibit dot-product engine for beyond von Neumann computing,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 27, no. 11, pp. 2556–2567, 2019.
[4] J. Amoh and K. M. Odame, “An optimized recurrent unit for ultra-low-power keyword spotting,” Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, vol. 3, no. 2, pp. 1–17, 2019.
[5] A. Zhou, A. Yao, Y. Guo, L. Xu, and Y. Chen, “Incremental network quantization: Towards lossless CNNs with low-precision weights,” arXiv preprint arXiv:1702.03044, 2017.
[6] M. I. Jordan, “Serial order: A parallel distributed processing approach,” in Advances in Psychology, vol. 121, pp. 471–495, Elsevier, 1997.
[7] J. L. Elman, “Finding structure in time,” Cognitive Science, vol. 14, no. 2, pp. 179–211, 1990.
[8] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[9] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.
[10] G. B. Moody and R. G. Mark, “The impact of the MIT-BIH arrhythmia database,” IEEE Engineering in Medicine and Biology Magazine, vol. 20, no. 3, pp. 45–50, 2001.
[11] A. L. Goldberger, L. A. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C.-K. Peng, and H. E. Stanley, “PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals,” Circulation, vol. 101, no. 23, pp. e215–e220, 2000.
[12] R. J. Baker, “CMOS mixed-signal circuit design,” 2002.
[13] M. Punzenberger and C. Enz, “A 1.2 V BiCMOS class AB log-domain filter,” in 1997 IEEE International Solid-State Circuits Conference. Digest of Technical Papers, pp. 56–57, IEEE, 1997.
[14] B. A. Minch, “Analysis and synthesis of static translinear circuits,” School of Electrical and Computer Engineering, Cornell University, NY, pp. 95–1, 2000.
[15] Y. Xu, Z. Chen, F. Li, and J. Meng, “A granular resampling method and adaptive speculative mechanism-based energy-efficient architecture for multiclass heartbeat classification,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 38, no. 11, pp. 2172–2176, 2018.
[16] Y. Zhao, Z. Shang, and Y. Lian, “A 13.34 µW event-driven patient-specific ANN cardiac arrhythmia classifier for wearable ECG sensors,” IEEE Transactions on Biomedical Circuits and Systems, vol. 14, no. 2, pp. 186–197, 2019.

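Full-text release date: 2031/10/18 (campus network)
Full-text release date: full text not authorized for public release (off-campus network)
Full-text release date: full text not authorized for public release (National Central Library: Taiwan Theses and Dissertations System)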