
Graduate student: 張鎮宇
Chen-Yu Chang
Thesis title: 在混合型儲存裝置基於優先權決策針對非關聯式資料庫系統之物件搬移設計
A Priority-based Object Migration Design for NoSQL Databases in a Hybrid Storage System
Advisor: 吳晉賢
Chin-Hsien Wu
Oral defense committee: 林淵翔
Yuan-Hsiang Lin
林昌鴻
Chang Hong Lin
沈中安
Chung-An Shen
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication: 2016
Graduation academic year: 104 (ROC calendar)
Language: Chinese
Number of pages: 56
Keywords (Chinese): 資料庫, 固態硬碟, 資料管理
Keywords (English): Database, Solid-State Drives, Data Management
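  • When an application demands high I/O performance, solid-state drives (SSDs) outperform conventional hard disk drives (HDDs), so system performance can be improved by migrating frequently used data to SSDs. However, SSDs cost far more per unit of capacity than HDDs, so a data migration mechanism between HDDs and SSDs is needed.
    In this thesis, we propose a priority-based object migration design for NoSQL database systems on hybrid storage. By placing high-priority data on SSDs to exploit their fast access, and low-priority data on HDDs to reduce hardware cost, the method improves the performance of NoSQL database systems. We carry out experiments to show that the proposed method achieves this goal.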


    When applications need a large amount of I/O, SSDs provide better performance than HDDs. Frequently accessed data can therefore be allocated to SSDs to improve performance. However, SSDs are more expensive than HDDs, so data must be migrated appropriately between HDDs and SSDs. In this thesis, we propose a priority-based object migration design for NoSQL databases in a hybrid storage system. The method places high-priority data on SSDs to exploit their fast access and allocates low-priority data to low-cost HDDs, improving the performance of NoSQL databases. The experimental results show that the proposed method achieves this goal.
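The placement idea in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: the record does not give the actual priority formula or migration policy, so the `StoredObject` type, the `place_objects` function, the numeric priority scores, and the greedy fill of a fixed SSD capacity are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StoredObject:
    name: str
    size_mb: int
    priority: float  # hypothetical score; the thesis's priority formula is not given here


def place_objects(objects, ssd_capacity_mb):
    """Greedily assign the highest-priority objects to the SSD until it is full.

    Assumed policy for illustration only: any object that does not fit in the
    remaining SSD space is placed on the HDD.
    """
    placement = {}
    free_mb = ssd_capacity_mb
    # Visit objects from highest to lowest priority.
    for obj in sorted(objects, key=lambda o: o.priority, reverse=True):
        if obj.size_mb <= free_mb:
            placement[obj.name] = "SSD"
            free_mb -= obj.size_mb
        else:
            placement[obj.name] = "HDD"
    return placement


# Hypothetical objects and sizes, chosen only to exercise the sketch.
objects = [
    StoredObject("users", size_mb=400, priority=0.9),
    StoredObject("logs", size_mb=800, priority=0.1),
    StoredObject("sessions", size_mb=300, priority=0.7),
]
placement = place_objects(objects, ssd_capacity_mb=750)
print(placement)
```

With a 750 MB SSD budget, the two high-priority objects fit on the SSD and the large low-priority object falls back to the HDD. A real design would also need to recompute priorities as access patterns change and move objects between devices accordingly.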

    Chapter 1 Introduction
      1.1 Preface
      1.2 Thesis Organization
    Chapter 2 Background
      2.1 Data Placement Method
      2.2 Hot Data Identification
      2.3 Hybrid Storage System
      2.4 NoSQL and CAP Theorem
      2.5 Apache Cassandra
    Chapter 3 Motivation and Related Work
    Chapter 4 Priority-based Data Migration Design
      4.1 System Overview
      4.2 Placement of Objects
      4.3 Priority Decision
    Chapter 5 Experiments and Performance Analysis
      5.1 Overview
      5.2 Yahoo! Cloud Service Benchmark
      5.3 Performance Analysis
        5.3.1 Workload A
        5.3.2 Workload B
        5.3.3 Workload C
        5.3.4 Workload D
        5.3.5 Workload F
    Chapter 6 Conclusion

