
Graduate student: 李弈鴻
I-Hung Li
Thesis title: 利用機器學習改善固態硬碟垃圾回收機制
Using Machine Learning to Improve Garbage Collection In SSD
Advisor: 吳晋賢
Chin-Hsien Wu
Oral defense committee: 林淵翔
Yuan-Hsiang Lin
林昌鴻
Chang Hong Lin
陳維美
Wei-Mei Chen
Degree: Master
Department: 電資學院 - 電子工程系
Department of Electronic and Computer Engineering
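Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: Chinese
Number of pages: 56
Chinese keywords: 機器學習, 垃圾回收, 固態硬碟, 閥值, 工作環境, 抹除
English keywords: Machine Learning, Garbage Collection, SSD, Threshold, Workload, Erase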
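  • In solid-state drives (SSDs), garbage collection has always been a critical component. When free space runs too low, garbage collection is triggered repeatedly: the valid data in a block is moved elsewhere and the block is then erased, so a single garbage collection takes considerable time. Triggering garbage collection too often shortens the lifetime of the SSD, and frequent garbage collection degrades I/O performance. We propose applying machine learning to garbage collection, so that a machine learning model runs garbage collection at appropriate times and thereby reduces the number of garbage collections and the response time. In the experiments, we modify an SSD simulator (MQSim) to implement the method of this thesis, and classify the parameters obtained at the FTL level into those that lead to garbage collection and those that do not, for use as training data. To obtain a lightweight yet highly accurate machine learning model, we analyzed several machine learning methods and decided to use the random forest algorithm. We hope that the method of this thesis improves I/O performance and extends the lifetime of SSDs.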


    Garbage collection (GC) plays an important role in solid-state drives (SSDs). When an SSD runs low on free space, it keeps performing GC until the free space exceeds the GC threshold. During GC, a victim block is selected according to a GC policy. Once a victim block has been selected, GC copies its valid pages to a free block and erases the victim block so that it becomes a free block again. GC can take a long time, degrade I/O performance, and reduce the endurance of SSDs. In this thesis, we use machine learning to predict a suitable moment to trigger GC, in order to reduce the number of GC executions and the response time. In the experiments, we modified MQSim, an SSD simulator, to implement the proposed method. We collect data from the flash translation layer (FTL) and label it as GC or non-GC to form the training data. We also compare different machine learning methods in order to obtain a lightweight model with high accuracy. Based on this analysis, we chose Random Forest as the machine learning method. We also demonstrate that the proposed method can improve the I/O performance and endurance of SSDs.
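
The GC loop described in the abstract can be illustrated with a minimal Python sketch. It is not code from the thesis or from MQSim. All names (`Block`, `PAGES_PER_BLOCK`, `count_free`, `select_victim`, `collect_once`, `run_gc`, `should_trigger`) are hypothetical, and the greedy victim policy is an assumption, since the abstract only says that a GC policy selects the victim. The `should_trigger` hook marks the point where a trained classifier, such as the Random Forest model mentioned above, could decide whether to start GC.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

PAGES_PER_BLOCK = 4  # hypothetical block size, kept tiny for illustration


@dataclass
class Block:
    # One entry per written page: True = valid data, False = invalid (stale).
    pages: List[bool] = field(default_factory=list)

    @property
    def valid_count(self) -> int:
        return sum(self.pages)

    @property
    def space(self) -> int:
        return PAGES_PER_BLOCK - len(self.pages)


def count_free(blocks: List[Block]) -> int:
    """Number of erased (empty) blocks."""
    return sum(1 for b in blocks if not b.pages)


def select_victim(blocks: List[Block]) -> Optional[Block]:
    """Greedy policy (an assumption): the full block with the fewest valid pages."""
    full = [b for b in blocks if b.space == 0]
    return min(full, key=lambda b: b.valid_count) if full else None


def collect_once(blocks: List[Block]) -> bool:
    """Relocate the victim's valid pages, then erase it. False if nothing is gained."""
    victim = select_victim(blocks)
    if victim is None or victim.valid_count == PAGES_PER_BLOCK:
        return False
    others = [b for b in blocks if b is not victim]
    if sum(b.space for b in others) < victim.valid_count:
        return False  # not enough room to relocate the valid pages
    for _ in range(victim.valid_count):
        # Fill the most-written block that still has room before using a free one.
        target = max((b for b in others if b.space > 0), key=lambda b: len(b.pages))
        target.pages.append(True)
    victim.pages = []  # erase: the victim becomes a free block again
    return True


def run_gc(
    blocks: List[Block],
    free_threshold: int,
    should_trigger: Optional[Callable[[List[Block]], bool]] = None,
) -> int:
    """Run GC until the free-block count exceeds the threshold; return erase count."""
    if should_trigger is None:
        # Baseline trigger: start GC once free blocks drop to the threshold.
        should_trigger = lambda blks: count_free(blks) <= free_threshold
    if not should_trigger(blocks):
        return 0
    erases = 0
    while count_free(blocks) <= free_threshold and collect_once(blocks):
        erases += 1
    return erases
```

Calling `run_gc(blocks, free_threshold)` uses the threshold-only trigger. Passing a different `should_trigger` callable, for example a wrapper around a model's prediction, changes only when GC starts, not how a victim is reclaimed.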

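    Chapter 1 Introduction
      1.1 Preface
      1.2 Thesis Organization
    Chapter 2 Background and Research Motivation
      2.1 Modern SSD Architecture
      2.2 Garbage Collection
      2.3 Machine Learning
      2.4 Applications of Machine Learning to Garbage Collection
      2.5 Research Motivation
    Chapter 3 Methodology
      3.1 Overall Architecture
      3.2 Training Data
      3.3 Data Processing
      3.4 Machine Learning Methods
      3.5 Decision Tree
      3.6 Random Forest
    Chapter 4 Experiments and Performance Analysis
      4.1 Experiment Overview
      4.2 MQSim
      4.3 Scikit-learn
      4.4 Cython
      4.5 Performance Analysis
        4.5.1 Experimental Environment and Simulator Parameters
        4.5.2 Analysis of the Experimental Workloads
        4.5.3 Comparison of GC Methods With and Without Machine Learning on Synthetic
        4.5.4 Comparison of GC Methods With and Without Machine Learning on Systor17
        4.5.5 Comparison of GC Methods With and Without Machine Learning on Prxy_1
        4.5.6 Comparison of GC Methods With and Without Machine Learning on Enterprise
    Chapter 5 Conclusion
    References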


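    Full-text release date: 2024/08/21 (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: Taiwan theses and dissertations system)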