研究生: |
蘇易騰 Yi-Teng Shu |
---|---|
論文名稱: |
天際線查詢平行化之研究 A Study on Parallel Skyline Query |
指導教授: |
陳維美
Wei-Mei Chen |
口試委員: |
林敬舜
Ching-Shun Lin 林昌鴻 Chang-Hong Lin 吳晉賢 Chin-Hsien Wu |
學位類別: |
碩士 Master |
系所名稱: |
電資學院 - 電子工程系 Department of Electronic and Computer Engineering |
論文出版年: | 2012 |
畢業學年度: | 100 |
語文別: | 中文 |
論文頁數: | 64 |
中文關鍵詞: | 天際線查詢 、平行化運算 、資料庫 |
外文關鍵詞: | skyline query, parallel computation, database |
相關次數: | 點閱:320 下載:1 |
分享至: |
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
天際線查詢(skyline query)是一種高效率多準則決策的資料分析工具,並且在資料庫的領域中受到相當多的關注,且將天際線查詢納為基本語法之一。在多核心處理器的發展潮流下,我們提出平行化天際線查詢演算法PSQ,支援漸進性(progressively)輸出,且適用於共享記憶體(shared memory)架構。PSQ演算法以維度(dimension)大小排序每筆資料(point),作個簡單分類,再以分隔點(terminating point)機制減少資料間的比較,可大幅提升運算效率。我們以四種不同的軟硬體組合實作PSQ演算法。從實驗結果得知,PSQ演算法在這四種模擬環境中,維度個數多或資料量大時,都能有效利用多核心達到顯著的平行化效果。
The skyline query is an efficient data analysis tool for multi-criteria decision making that has received significant attention in the database community. With the advent of multi-core processors, we present a new parallel skyline query algorithm PSQ that can be applied to shared memory architectures, to progressively return skyline points as they are identified. The PSQ algorithm sorts the given dataset in descending order for coordinate component separately, and then finds out a terminating point for eliminating redundant computations of skyline query. Experimental results show that the PSQ algorithm successfully uses multiple cores to improve performance when dimensionality of the dataset grows or number of data points is large.
[1] G. M. Amdahl, “Validity of the Single-Processor Approach to Achieving Large Scale Computing Capabilities,” AFIPS Conference Proceedings, 1967. pp 483 – 485.
[2] S. Borzsonyi, D. Kossmann and K. Stocker, “The skyline operator,” Proceedings of the 17th International Conference on Data Engineering, 2001. pp 421–430.
[3] W. T. Balke, U. Guntzer and J. X. Zheng, “Efficient Distributed Skylining for Web Information Systems,” Proceedings of the 9th International Conference on Extending Database Technology, 2004. pp 256 - 273.
[4] C. Y. Chan, H. V. Jagadish, K. L. Tan, A. K. Tung and Z. Zhang, “Finding k-dominant skylines in high dimensional space,” Proceedings of the ACM SIGMOD International Conference on Management of Data, 2006. pp 503 – 514.
[5] C. Y. Chan, H. V. Jagadish, K. L. Tan, A. K. Tung and Z. Zhang, “On high dimensional skylines,” Proceedings of the 10th International Conference on Extending Database Technology, 2006. pp 478 – 495.
[6] J. Chomicki, P. Godfrey, J. Gryz and D. Liang, “Skyline with Presorting,” Proceedings of the 19th International Conference on Data Engineering, 2003. pp 717 – 719.
[7] W. M. Chen, H. K. Hwang and T. H. Tsai, “Maxima-finding algorithms for multidimensional samples: A two-phase approach,” Computational Geometry: Theory and Applications, 2012. pp 33 – 53.
[8] J. Diaz, C. Munoz-Caro and A. Nino, “A Survey of Parallel Programming Models and Tools in the Multi and Many-core Era,” IEEE Transactions on Parallel and Distributed Systems, 2011.
[9] L. Dagum and R. Menon, ”OpenMP: an industry-standard API for shared- memory programming,” IEEE Computational Science and Engineering, 1998. pp 46 – 55.
[10] K. Fuerlinger and M. Gerndt, “Analyzing overheads and scalability characteristics of OpenMP applications,” Proceedings of the 7th International Meeting on High Performance Computing for Computational Science,” 2006. pp 39 – 51.
[11] A. Guttman, “R-trees: a dynamic index structure for spatial searching,” Proceedings of the ACM SIGMOD International Conference on Management of Data, 1984. pp 47 – 57.
[12] W. Gropp, S. H. Lederman, A. Lumsdaine, E. Lusk, B. Nitzberg, W. Saphir and M. Snir, “MPI: The Complete Reference, 2nd Edition, Volume 2 - The MPI-2 Extensions,” The MIT Press, 1998.
[13] Z. Huang, C. S. Jensen, H. Lu and B. C. Ooi, “Skyline Queries Against Mobile Lightweight Devices in MANETs,” Proceedings of the International Conference on Data Engineering, 2006. pp 210 – 219.
[14] H. Im, J. Park and S. Park, “Parallel skyline computation on multicore architecture,” Information Systems, 2011. pp 808 - 823.
[15] I. Jung, J. Hyun and J. Lee, ”A scheduling policy for preserving cache locality in a
multiprogrammed system,” Journal on Systems Architecture, 2000. pp 1191 – 1204.
[16] H. Kung, F. Luccio and F. Preparata, “On finding the maxima of a set of vectors,” Journal of the ACM, 1975. pp 469 – 476.
[17] D. Kossmann, F. Ramsak and S. Rost, “Shooting stars in the sky: an online algorithm for skyline queries,” Proceedings of 28th International Conference on Very Large Data Bases, 2002. pp 275–286.
[18] D. H. McLain, “Drawing contours from arbitrary data points,” The Computer Journal, Nov. 1974.
[19] F. P. Preparata and M. I. Shamos, “Computational Geometry: An Introduction,” Springer-Verlag, 1985.
[20] D. Papadias, Y. Tao, G. Fu and B. Seeger, “Progressive skyline computation in database systems,” ACM Transactions on Database Systems, 2005. pp 41 – 82.
[21] J. Selke, C. Lofi and W. T. Balke, “Highly Scalable Multiprocessing Algorithms for Preference-Based Database Retrieval,” Proceedings of the 15th International Conference on Database Systems for Advanced Applications, 2010. pp 246 – 260.
[22] K. Tan, P. Eng and B. C. Ooi, “Efficient progressive skyline computation,” Proceedings of the 27th International Conference on Very Large Data Bases, 2001. pp 301–310.
[23] A. Vlachou, C. Doulkeridis, Y. Kotidis and M. Vazirgiannis, “Skypeer: Efficient Subspace Skyline Computation over Distributed Data,” Proceedings of the International Conference on Data Engineering, 2007. pp 416 - 425.
[24] S. Wang, B. C. Ooi, A.K.H. Tung and L. Xu, “Efficient Skyline Query Processing on Peer-to-Peer Networks,” Proceedings of the International Conference on Data Engineering, 2007. pp 1126 - 1135.
[25] P. Wu, C. Zhang and Y. Feng, “Parallelizing Skyline Queries for Scalable Distribution,” Proceedings of the International Conference on Extending Database Technology, 2005. pp 112 – 130.
[26] Z. Zhang, X. Guo, H. Lu, A. Tung and N. Wang, “Discovering strong skyline points in high dimensional spaces,” Proceedings of the 14th ACM International Conference on Information and Knowledge Management, 2005. pp 247 - 248.
[27] L. Zhu, Y. Tao and S. Zhou, “Distributed skyline retrieval with low bandwidth consumption,” IEEE Transactions on Knowledge and Data Engineering, 2009. pp 384 – 400.