Graduate student: | 黃正杰 Cheng-Chieh Huang |
---|---|
Thesis title: | 針對Cassandra資料庫之混合型儲存式系統 A Hybrid Storage System For Cassandra Databases |
Advisor: | 吳晋賢 Chin-Hsien Wu |
Oral defense committee: | 阮聖彰 Shanq-Jang Ruan, 陳維美 Wei-Mei Chen, 吳晋賢 Chin-Hsien Wu, 陳雅淑 Ya-Shu Chen |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering |
Publication year: | 2017 |
Graduation academic year: | 105 (ROC calendar) |
Language: | Chinese |
Pages: | 45 |
Keywords (Chinese): | database, hybrid storage system, solid-state drive |
Keywords (English): | Cassandra, Hybrid Storage, database |
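Compared with traditional hard disk drives, solid-state drives offer better I/O performance, especially when applications are I/O-intensive. We can move relatively frequently used data to the solid-state drive to improve overall system performance. However, because solid-state drives cost more than hard disk drives, data migration should achieve the greatest performance gain at the lowest cost. We therefore design a storage system for Cassandra in which hot data is migrated to the solid-state drive at an appropriate time, and data that has not been used for a long time is migrated to the hard disk drive.
In this thesis, we propose a method that modifies the architecture of the Cassandra non-relational database system so that frequently used data is placed on the solid-state drive and infrequently used data on the hard disk drive. The method exploits the high access speed of the solid-state drive to improve performance and the low price of the hard disk drive to reduce hardware cost. In Chapter 5, we use experiments to show that the proposed method achieves this goal.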
Compared with traditional hard disk drives (HDDs), solid-state drives (SSDs) offer better I/O performance, especially when applications are I/O-intensive. We can improve I/O performance by moving frequently used data to the SSD. However, SSDs are more expensive than HDDs, so we want to obtain reasonable I/O performance from limited SSD space. To achieve this goal, we design a hybrid storage system for Cassandra databases. In this thesis, we modify the Cassandra NoSQL database by implementing a hybrid storage system. We place high-priority data on the SSD for fast access and low-priority data on the HDD for low cost. The experimental results show that the proposed method achieves this goal.
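The abstract does not state how hot and cold data are identified, so the following Python sketch is only an illustration under assumed rules: an item is promoted to the SSD after a fixed number of reads and demoted to the HDD after a fixed idle period. The class `TierPlanner`, its thresholds, and the item names are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field

SSD, HDD = "ssd", "hdd"


@dataclass
class TierPlanner:
    """Hypothetical hot/cold placement policy; thresholds are illustrative."""

    hot_reads: int = 3         # assumed: reads needed before an item is hot
    idle_limit: float = 60.0   # assumed: idle seconds before demotion
    tier: dict = field(default_factory=dict)       # item -> SSD or HDD
    reads: dict = field(default_factory=dict)      # item -> read count
    last_seen: dict = field(default_factory=dict)  # item -> last access time

    def record_read(self, item: str, now: float) -> None:
        """Count a read and promote the item once it becomes hot."""
        self.tier.setdefault(item, HDD)  # new data starts on the cheap tier
        self.reads[item] = self.reads.get(item, 0) + 1
        self.last_seen[item] = now
        if self.reads[item] >= self.hot_reads:
            self.tier[item] = SSD

    def demote_idle(self, now: float) -> list:
        """Move SSD items unused for longer than idle_limit back to the HDD."""
        demoted = []
        for item, tier in self.tier.items():
            if tier == SSD and now - self.last_seen[item] > self.idle_limit:
                self.tier[item] = HDD
                self.reads[item] = 0  # the item must earn promotion again
                demoted.append(item)
        return demoted


planner = TierPlanner()
for t in (0.0, 1.0, 2.0):
    planner.record_read("table_a", now=t)  # read three times: becomes hot
planner.record_read("table_b", now=2.0)    # read once: stays cold
placement_after_reads = dict(planner.tier)
demoted = planner.demote_idle(now=100.0)   # table_a has been idle for 98 s
```

The sketch ignores SSD capacity, which the abstract names as the limiting resource. A fuller policy would also need to decide which items to evict when the SSD is full.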