Graduate Student: | 鄭春章 Chun-Zhang Zheng
Thesis Title: | 一個基於混合式Computational Storage架構應用於加速CNN神經網路學習 (A Hybrid Computational Storage Architecture to Accelerate CNN Training)
Advisor: | 吳晋賢 Chin-Hsien Wu
Committee Members: | 吳晋賢 Chin-Hsien Wu, 陳維美 Wei-Mei Chen, 林淵翔 Yuan-Hsiang Lin, 林昌鴻 Chang-Hong Lin
Degree: | Master (碩士)
Department: | 電資學院 Department of Electronic and Computer Engineering
Year of Publication: | 2020
Academic Year of Graduation: | 108
Language: | Chinese
Pages: | 67
Chinese Keywords: | 計算儲存體、快閃記憶體、捲積神經網路
English Keywords: | Computational Storage, NAND Flash Memory, Convolutional Neural Networks
In the field of storage devices, the Computational Storage Drive (CSD) has become an emerging research topic in recent years. CSDs offer low power consumption together with high-speed, highly parallel computing, and these advantages have drawn a growing number of vendors and researchers into the area. Existing applications, however, keep storage and computation separated on different devices. To move computation down to the storage level, we need a storage solution with high-speed, highly parallel computing, and the CSD is currently one of the most promising candidates: it is regarded as a new form of future commercial storage device, with stronger computing performance than a traditional SSD and room for more diverse applications to be developed on top of it.
Because a CSD computes at high speed and with high parallelism, it can fully exploit the highly parallel access strategies inside an SSD when using NAND flash, such as multi-channel parallelism [1], multi-die parallelism, and so on. These internal parallelization mechanisms give a CSD better data-access speed than ordinary application scenarios. This thesis investigates the problems encountered when applying CSDs to CNN training, proposes a method suited to the CSD architecture for accelerating CNN training, and analyzes the performance of simultaneously mixing CSDs with different performance levels.
In storage systems, the Computational Storage Drive (CSD) has become popular in recent years. A CSD offers low power consumption together with high-speed, highly parallel computing. However, existing applications execute storage and computation on separate devices. If we want to move computation into the storage level, we need a high-speed, highly parallel computing solution, and the CSD is one of the most promising options. It is a new type of storage device for future commercial storage systems: it has more powerful computing performance than a traditional SSD, and more diverse applications can be developed on it.
In addition, because a CSD has the characteristics of high-speed, highly parallel computing, it can fully exploit the highly parallel access strategies of an SSD when using NAND flash, such as multi-channel parallelism [1], multi-die parallelism, and so on. These internal parallelization mechanisms give a CSD better data-access speed than in general application scenarios. This thesis discusses the problems encountered when CSDs are applied to CNN training, proposes a method suited to the CSD architecture for accelerating CNN training, and presents a performance analysis of hybrid configurations that mix CSDs with different performance levels.
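The multi-channel parallelism the abstract refers to can be pictured as striping data across independent flash channels and issuing the per-channel reads concurrently. The following is a toy sketch only, not the thesis's implementation: the channel count, page size, and in-memory "flash array" are all illustrative assumptions standing in for real controller hardware.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative assumption: a CSD exposes NUM_CHANNELS independent flash
# channels, so a data stripe can be fetched from all channels in parallel.
NUM_CHANNELS = 4
PAGE_SIZE = 8  # toy page size in bytes

# Toy "flash array": each channel holds its own pages; channel c's pages
# are filled with the byte value c so stripes are easy to inspect.
flash = [[bytes([c]) * PAGE_SIZE for _ in range(16)] for c in range(NUM_CHANNELS)]

def read_page(channel: int, page: int) -> bytes:
    """Simulated page read from one channel, independent of the others."""
    return flash[channel][page]

def read_stripe(page: int) -> bytes:
    """Read the same page index from every channel concurrently and
    concatenate the results, as a multi-channel controller would."""
    with ThreadPoolExecutor(max_workers=NUM_CHANNELS) as pool:
        chunks = pool.map(read_page, range(NUM_CHANNELS), [page] * NUM_CHANNELS)
    return b"".join(chunks)

stripe = read_stripe(0)
print(len(stripe))  # 32 bytes: NUM_CHANNELS * PAGE_SIZE fetched in one stripe
```

With independent channels, the stripe latency is roughly that of a single page read rather than four sequential reads, which is the effect the internal parallelization mechanisms above exploit when feeding training data to a CNN.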
[1] O. Yang, N. Xiao, M. Lai, "A scalable multi-channel parallel NAND flash memory controller architecture." 2011 Sixth Annual Chinagrid Conference. IEEE, 2011.
[2] T. Li, Z. Lei, "A novel multiple dies parallel nand flash memory controller for high-speed data storage." 2017 13th IEEE International Conference on Electronic Measurement & Instruments (ICEMI). IEEE, 2017.
[3] C. Zambelli, R. Bertaggia, L. Zuolo, R. Micheloni, "Enabling Computational Storage Through FPGA Neural Network Accelerator for Enterprise SSD." IEEE Transactions on Circuits and Systems II: Express Briefs 66.10 (2019): 1738-1742.
[4] M. Torabzadehkashi, S. Rezaei, V. Alves, N. Bagherzadeh, "Compstor: an in-storage computation platform for scalable distributed processing." 2018 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). IEEE, 2018.
[5] W. H. Wu, "A Retention-Error Mitigation Method based on TLC NAND Flash Memory," National Taiwan University of Science and Technology, 2019.
[6] ADVANTECH, Tech. Rep., 2016.
[7] M. Torabzadehkashi, A. Heydarigorji, S. Rezaei, H. Bobarshad, V. Alves, N. Bagherzadeh, "Accelerating HPC applications using computational storage devices." 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). IEEE, 2019.
[8] C. Yakopcic, M. Z. Alom, T. M. Taha, "Extremely parallel memristor crossbar architecture for convolutional neural network implementation." 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, 2017.
[9] K. Fukushima, S. Miyake, T. Ito, "A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position." Biol. Cybern. 36 (1980): 193-202.
[10] Y. Lecun, L. Bottou, Y. Bengio, P. Haffner, "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
[11] "Convolutional Neural Networks (LeNet) - Deep Learning 0.1" in DeepLearning 0.1, LISA Lab.
[12] A. Krizhevsky, I. Sutskever, G. E. Hinton, "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
[13] K. Simonyan, A. Zisserman, "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
[14] K. Sun, S. Li, Y. Luo, R. Renteria, K. Choi, "Highly-efficient parallel convolution acceleration by using multiple GPUs." 2017 International SoC Design Conference (ISOCC). IEEE, 2017.
[15] D. Li, Y. Yang, W. Li, Q. Yang, "CISC: Coordinating Intelligent SSD and CPU to Speedup Graph Processing." 2018 17th International Symposium on Parallel and Distributed Computing (ISPDC). IEEE, 2018.
[16] G. Begna, D. B. Rawat, M. Garuba, L. Njilla, "SecureCASH: Securing Context-Aware Distributed Storage and Query Processing in Hybrid Cloud Framework." 2018 IEEE Conference on Communications and Network Security (CNS). IEEE, 2018.
[17] M. Torabzadehkashi, S. Rezaei, A. Heydarigorji, H. Bobarshad, V. Alves, N. Bagherzadeh, "Catalina: In-storage processing acceleration for scalable big data analytics." 2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP). IEEE, 2019.
[18] X. Song, T. Xie, W. Pan, "RISP: a reconfigurable in-storage processing framework with energy-awareness." 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID). IEEE, 2018.
[19] J. C. Vega, Q. C. Shen, P. Chow, "SHIP: Storage for Hybrid Interconnected Processors." 2020 IEEE 28th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). IEEE, 2020.
[20] A. A. Devarajan, T. SudalaiMuthu, "Cloud Storage Monitoring System analyzing through File Access Pattern." 2019 International Conference on Computational Intelligence in Data Science (ICCIDS). IEEE, 2019.
[21] Z. He, J. Kuang, Y. Tan, W. Liu, B. Sheng, "Design and Implementation of GPU Accelerated Active Storage in FastDFS." 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). IEEE, 2019.
[22] K. Sun, S. Li, Y. Luo, R. Renteria, K. Choi, "Highly-efficient parallel convolution acceleration by using multiple GPUs." 2017 International SoC Design Conference (ISOCC). IEEE, 2017.
[23] N. Dryden, N. Maruyama, T. Benson, T. Moon, M. Snir, B. V. Essen, "Improving strong-scaling of CNN training by exploiting finer-grained parallelism." 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS). IEEE, 2019.
[24] S. Guedria, N. D. Palma, F. Renard, N. Vuillerme, "Auto-CNNp: a component-based framework for automating CNN parallelism." 2019 IEEE International Conference on Big Data (Big Data). IEEE, 2019.