Graduate student: | 李弈鴻 I-Hung Li |
---|---|
Thesis title: | 利用機器學習改善固態硬碟垃圾回收機制 Using Machine Learning to Improve Garbage Collection In SSD |
Advisor: | 吳晋賢 Chin-Hsien Wu |
Oral defense committee: | 林淵翔 Yuan-Hsiang Lin, 林昌鴻 Chang Hong Lin, 陳維美 Wei-Mei Chen |
Degree: | 碩士 Master |
Department: | 電資學院 - 電子工程系 Department of Electronic and Computer Engineering |
Year of publication: | 2019 |
Academic year of graduation: | 107 |
Language: | Chinese |
Number of pages: | 56 |
Chinese keywords: | 機器學習, 垃圾回收, 固態硬碟, 閥值, 工作環境, 抹除 |
English keywords: | Machine Learning, Garbage Collection, SSD, Threshold, Workload, Erase |
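Garbage collection has always been an important part of a solid-state drive (SSD). When free space runs low, garbage collection is triggered repeatedly: the valid data in a block is moved elsewhere and the block is then erased, so a single garbage collection takes considerable time. Triggering garbage collection too many times shortens the lifetime of the SSD, and frequent garbage collection degrades I/O performance. We propose applying machine learning to garbage collection so that a machine learning model performs garbage collection at suitable moments, reducing both the number of garbage collections and the response time. In the experiments, we modify an SSD simulator (MQSim) to implement the method of this thesis, and classify the parameters obtained at the FTL level into those that lead to garbage collection and those that do not, for use as training data. To obtain a lightweight yet highly accurate machine learning model, we analyzed several machine learning methods and decided to use the random forest algorithm. We hope that the proposed method improves I/O performance and extends the lifetime of SSDs.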
Garbage collection (GC) plays an important role in solid-state drives (SSDs). When an SSD runs out of free space, it keeps performing GC until the free space exceeds the GC threshold. During GC, a GC policy determines which victim block is selected. Once a victim block is selected, GC copies its valid pages to a free block and then erases the victim block so that it becomes a free block. GC can take a long time, degrade I/O performance, and reduce the endurance of SSDs. In this thesis, we use machine learning to predict a suitable moment to trigger GC, in order to reduce the number of GC executions and its response time. In the experiments, we modify MQSim, an SSD simulator, to implement the proposed method. We collect data from the flash translation layer (FTL) and label it as GC or non-GC to form the training data. We also compare different machine learning methods experimentally in order to obtain a lightweight machine learning model with high accuracy. Based on this analysis, we choose Random Forest as the machine learning method. We also demonstrate that the proposed method can improve the I/O performance and endurance of SSDs.
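The baseline GC procedure described in the abstract (trigger on a free-space threshold, select a victim block, copy its valid pages, erase it) can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation or MQSim code: the greedy victim policy, the free-page-based threshold, the tiny block size, and all names (`Block`, `select_victim`, `collect_once`, `run_gc`) are assumptions made for the example. The thesis's machine learning model would decide when to trigger GC; that part is not modeled here.

```python
from dataclasses import dataclass, field

PAGES_PER_BLOCK = 4  # hypothetical value, kept tiny for readability


@dataclass
class Block:
    # Each page is in one of three states: "free", "valid", or "invalid".
    pages: list = field(default_factory=lambda: ["free"] * PAGES_PER_BLOCK)
    erase_count: int = 0

    def count(self, state):
        return self.pages.count(state)


def free_pages(blocks):
    """Total number of free pages across the given blocks."""
    return sum(b.count("free") for b in blocks)


def select_victim(blocks):
    """Greedy policy (an assumption): among blocks that hold invalid pages,
    pick the one with the fewest valid pages to copy."""
    candidates = [b for b in blocks if b.count("invalid") > 0]
    return min(candidates, key=lambda b: b.count("valid")) if candidates else None


def collect_once(blocks):
    """Copy the victim's valid pages into other blocks, then erase the victim.
    Returns the number of pages copied, or None if there is no victim."""
    victim = select_victim(blocks)
    if victim is None:
        return None
    to_move = victim.count("valid")
    others = [b for b in blocks if b is not victim]
    if free_pages(others) < to_move:
        raise RuntimeError("not enough free pages to relocate valid data")
    for _ in range(to_move):
        target = next(b for b in others if b.count("free") > 0)
        target.pages[target.pages.index("free")] = "valid"
    victim.pages = ["free"] * PAGES_PER_BLOCK  # block erase
    victim.erase_count += 1
    return to_move


def run_gc(blocks, threshold):
    """Repeat GC until the free-page count exceeds the threshold
    or no victim block remains."""
    copied = erased = 0
    while free_pages(blocks) <= threshold:
        moved = collect_once(blocks)
        if moved is None:
            break
        copied += moved
        erased += 1
    return copied, erased


blocks = [
    Block(["valid", "invalid", "invalid", "invalid"]),
    Block(["valid", "valid", "invalid", "valid"]),
    Block(["valid"] * 4),
    Block(),
]
copied, erased = run_gc(blocks, threshold=6)
print(copied, erased, free_pages(blocks))  # prints: 1 1 7
```

In this toy run the greedy policy picks the block with a single valid page, so one page copy and one erase raise the free-page count from 4 to 7. Every extra copy and erase costs time and endurance, which is why the number of GC executions is the quantity the abstract aims to reduce.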