Student: | 吳華軒 Hau-Shan Wu
---|---
Thesis Title: | A Data De-Duplication Access Framework for Solid State Drives (針對固態硬碟所設計的消除重複資料存取架構)
Advisor: | 吳晉賢 Chin-Hsien Wu
Committee: | 阮聖彰 Shanq-Jang Ruan, 林昌鴻 Chang Hong Lin, 陳維美 Wei-Mei Chen
Degree: | Master
Department: | Department of Electronic and Computer Engineering
Year of Publication: | 2010
Academic Year: | 98
Language: | English
Pages: | 45
Keywords: | Flash memory, Solid state drive, Duplicate data, Access framework
With the rapid development of SSDs (Solid State Drives), traditional
hard drives have been replaced by SSDs in many applications. Since
SSDs consist of NAND flash memory, the main challenge for SSDs is
that NAND flash memory is highly sensitive to write requests: because
of the out-of-place-update characteristic of flash memory, a large
number of write requests triggers garbage collection to reclaim the
space occupied by invalid pages. Frequent garbage collection reduces
both the lifetime of flash memory and overall performance. When SSDs
are used for large-scale data storage, significantly decreasing the
amount of data written becomes an important topic. In this thesis, we
propose a data de-duplication access framework for SSDs. The
objective is to eliminate as much duplicate data as possible and
thereby reduce the amount of data written. We combine file-based
de-duplication and static-chunking de-duplication schemes to achieve
complete data de-duplication, and we exploit application-based
locality and file-name locality to improve the accuracy of
identifying duplicate data. According to the experimental results,
the proposed framework can efficiently identify duplicate data and
significantly decrease the amount of data written, while keeping the
overhead reasonable.
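The two-level scheme described above can be illustrated with a minimal in-memory sketch. This is not the thesis's actual implementation: the SHA-1 fingerprint, the 4 KiB chunk size, and the dictionary-based stores are all assumptions made for illustration. A write first checks a whole-file fingerprint (file-based de-duplication); on a miss, the file is split into fixed-size chunks (static chunking) and only chunks never seen before are counted as written.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; not the thesis's parameter


class DedupStore:
    """Illustrative in-memory store combining file-based and
    static-chunking de-duplication."""

    def __init__(self):
        self.chunks = {}      # chunk digest -> chunk bytes
        self.files = {}       # file name -> list of chunk digests
        self.file_index = {}  # whole-file digest -> list of chunk digests

    def write_file(self, name, data):
        """Store a file; return the number of bytes actually written
        (i.e., not eliminated by de-duplication)."""
        fd = hashlib.sha1(data).hexdigest()
        if fd in self.file_index:
            # File-based de-duplication: an identical file was stored
            # before, so no new data reaches the flash memory.
            self.files[name] = self.file_index[fd]
            return 0
        digests, written = [], 0
        # Static chunking: split the file into fixed-size chunks and
        # fingerprint each chunk individually.
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            d = hashlib.sha1(chunk).hexdigest()
            if d not in self.chunks:
                # Only a chunk never seen before costs a real write.
                self.chunks[d] = chunk
                written += len(chunk)
            digests.append(d)
        self.files[name] = digests
        self.file_index[fd] = digests
        return written

    def read_file(self, name):
        """Reassemble a file from its stored chunks."""
        return b"".join(self.chunks[d] for d in self.files[name])


store = DedupStore()
a = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))  # four distinct chunks
w1 = store.write_file("a.bin", a)   # 16384: every chunk is new
w2 = store.write_file("b.bin", a)   # 0: eliminated at the file level
c = a[:CHUNK_SIZE] + bytes([9]) * CHUNK_SIZE
w3 = store.write_file("c.bin", c)   # 4096: only the second chunk is new
```

In this sketch the file-level check is just a fast path; a modified file falls through to the chunk level, where the unchanged chunks are still eliminated, which mirrors how the two schemes complement each other in the framework.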