研究生: |
蔡信宏 Hsin-Hung Tsai |
---|---|
論文名稱: |
ECDF問題應用於CPU-GPU架構之研究 Design and Implementation of the ECDF Query Problem on CPU-GPU Architectures |
指導教授: |
陳維美
Wei-Mei Chen |
口試委員: |
陳維美
Wei-Mei Chen 林昌鴻 Chang-Hong Lin 吳晉賢 Chin-Hsien Wu |
學位類別: |
碩士 Master |
系所名稱: |
電資學院 - 電子工程系 Department of Electronic and Computer Engineering |
論文出版年: | 2021 |
畢業學年度: | 109 |
語文別: | 中文 |
論文頁數: | 55 |
中文關鍵詞: | ECDF 、R-tree 、平行演算法 、GPGPU |
外文關鍵詞: | ECDF, R-tree, Parallel algorithm, GPGPU |
相關次數: | 點閱:295 下載:0 |
分享至: |
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
為了提升Empirical Cumulative Distribution Function(ECDF)的搜尋效率,於1980年代已有文獻探討平行演算法的可能性,但當時受限於科技發展,尚未有實驗驗證;而近幾年高速運算平台的普及,實驗環境已是研究者可以負擔,但至今尚未有人對該問題進行平行化的處理。為此我們運用Graphics Processing Unit(GPU)多個核心的優勢進行演算法設計,一次可處理多個搜尋要求。運用Static R-tree 的概念設計精簡的資料結構,找出搜尋點與Minimum Bounding Rectangle(MBR)的支配關係,簡化流程以提升搜尋效能。本研究提出新的葉節點整理方式、修改資料結構使R-tree擁有Complete Tree的特性、簡化搜尋的條件及符合硬體特性之遍歷演算法。實驗結果顯示,本論文有效提升ECDF的搜尋效能,並在R-tree整理方法較Hilbert curve、Z-order更加有優勢,與其他文獻提出的R-tree架構進行比較,效率上也有顯著的提升。
Since the 80s, there were journals discussing the possibility of using parallel algorithms on Empirical Cumulative Distribution Function (ECDF). However, due to the less development of science and technology at that time, there was no experimental verification. Although the high-speed computing platform become more prevalent over the last decade, there are no research on parallelizing the ECDF query problem yet so far. In this research, we use Graphics Processing Unit(GPU)as the parallelized platform. By taking advantage of multiple cores, GPU can handle heavy requests, thus improve query efficiency. We design a reduced data structure by improving the concept of Static R-tree, and find out the dominating relationship between query points and Minimum Bounding Rectangle (MBR), and then simplify the querying process. Our research propose a new leaf node arranging method, modify the data structure to give R-tree the characteristics of Complete Tree, simplify query condition, and make good use of shared memory with the traversal algorithm. The experimental results show that the method we proposed has a huge improvement upon the ECDF query problem, and that the leaf node arranging method is more beneficial than space filling curve, e.g., Hilbert Curve and Z-order. Also, the R-tree structure we proposed is more efficiency than other research.
[1] S. Borzsony, D. Kossmann, and K. Stocker, “The skyline operator,” in Proceedings 17th international conference on data engineering, pp. 421–430, IEEE, 2001.
[2] C.-Y. Chan, H. Jagadish, K.-L. Tan, A. K. Tung, and Z. Zhang, “Finding k-dominant skylines in high dimensional space,” in Proceedings of the 2006 ACM SIGMOD international conference on Management of data, pp. 503–514, 2006.
[3] X. Lin, Y. Yuan, Q. Zhang, and Y. Zhang, “Selecting stars: The k most representative skyline operator,” in 2007 IEEE 23rd International Conference on Data Engineering, pp. 86–95, IEEE, 2007.
[4] D. Papadias, Y. Tao, G. Fu, and B. Seeger, “Progressive skyline computation in database systems,” ACM Transactions on Database Systems (TODS), vol. 30, no. 1, pp. 41–82, 2005.
[5] M. J. Flynn, “Some computer organizations and their effectiveness,” IEEE transactions on computers, vol. 100, no. 9, pp. 948–960, 1972.
[6] NVIDIA, “CUDA C+ + Programming Guide.” https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#features-and-technical-specifications, 2021.
[7] A. Guttman, “R-trees: A dynamic index structure for spatial searching,” in Proceedings of the 1984 ACM SIGMOD international conference on Management of data, pp. 47–57, 1984.
[8] N. Roussopoulos and D. Leifker, “Direct spatial search on pictorial databases using packed r-trees,” in Proceedings of the 1985 ACM SIGMOD international conference on Management of data, pp. 17–31, 1985.
[9] I. Kamel and C. Faloutsos, “Hilbert r-tree: An improved r-tree using fractals,” tech. rep., 1993.
[10] S. T. Leutenegger, M. A. Lopez, and J. Edgington, “Str: A simple and efficient algorithm for r-tree packing,” in Proceedings 13th International Conference on Data Engineering, pp. 497–506, IEEE, 1997.
[11] L. Luo, M. D. Wong, and L. Leong, “Parallel implementation of r-trees on the gpu,” in 17th Asia and South Pacific Design Automation Conference, pp. 353–358, IEEE, 2012.
[12] J. Kim, W.-K. Jeong, and B. Nam, “Exploiting massive parallelism for indexingmulti-dimensional datasets on the gpu,” IEEE Transactions on Parallel and Distributed Systems, vol. 26, no. 8, pp. 2258–2271, 2014.
[13] S. K. Prasad, M. McDermott, X. He, and S. Puri, “Gpu-based parallel r-tree construction and querying,” in 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, pp. 618–627, IEEE, 2015.
[14] M. Kim, L. Liu, and W. Choi, “A gpu-aware parallel index for processing highdimensional big data,” IEEE Transactions on Computers, vol. 67, no. 10, pp. 1388–1402, 2018.
[15] S. You, J. Zhang, and L. Gruenwald, “Parallel spatial query processing on gpus using r-trees,” in Proceedings of the 2Nd ACM SIGSPATIAL international workshop on analytics for big geospatial data, pp. 23–31, 2013.
[16] Z. N. Lewis and Y.-C. Tu, “G-pics: A framework for gpu-based spatial indexing and query processing,” IEEE Transactions on Knowledge and Data Engineering, 2020.
[17] J. Qi, Y. Tao, Y. Chang, and R. Zhang, “Packing r-trees with space-filling curves: Theoretical optimality, empirical efficiency, and bulk-loading parallelizability,” ACM Transactions on Database Systems (TODS), vol. 45, no. 3, pp. 1–47, 2020.
[18] J. L. Bentley, “Multidimensional divide-and-conquer,” Communications of the ACM, vol. 23, no. 4, pp. 214–229, 1980.
[19] F. Dehne, “O (n12) algorithms for the maximal elements and ecdf searching problem on a mesh-connected parallel computer,” Information Processing Letters, vol. 22, no. 6, pp. 303–306, 1986.
[20] F. Dehne and I. Stojmenovic, “An o (√ n) time algorithm for the ecdf searching problem for arbitrary dimensions on a mesh-of-processors,” Information processing letters, vol. 28, no. 2, pp. 67–70, 1988.
[21] J. Petersson, “Parallel algorithms for geometric dominance problems,” Parallel Algorithms: Third DIMACS Implementation Challenge, October 17-19, 1994, vol. 30, p. 97, 1997.
[22] W.-M. Chen and G.-H. Chen, “Divide-and-conquer recurrences associated with generalized heaps, optimal merge, and related structures,” Theoretical computer science, vol. 292, no. 3, pp. 667–677, 2003.
[23] N. Govender, D. N. Wilke, S. Kok, and R. Els, “Development of a convex polyhedral discrete element simulation framework for nvidia kepler based gpus,” Journal of Computational and Applied Mathematics, vol. 270, pp. 386–400, 2014.
[24] J. T. Robinson, “The kdb-tree: a search structure for large multidimensional dynamic indexes,” in Proceedings of the 1981 ACM SIGMOD international conference on Management of data, pp. 10–18, 1981.