
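Graduate student: 蔣昌霖 (Chang-Lin Chiang)
Thesis title: Vertical-Based Processing of Skyline Queries in MapReduce (在MapReduce中垂直式天際線查詢機制之研究)
Advisor: 陳省隆 (Hsing-Lung Chen)
Oral examination committee: 呂政修 (Jenq-Shiou Leu), 陳郁堂 (Yie-Tarng Chen), 莊博任 (Po-Jen Chuang)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: Chinese
Number of pages: 79
Keywords (Chinese): 天際線查詢, MapReduce, 角錐式分割
Keywords (English): Skyline query, MapReduce, pyramid-based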
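  • Skyline queries have received wide attention over the past decade because they are useful in many applications, especially multi-criteria decision problems. In today's era of information explosion, however, processing a skyline query over a large volume of data is very time-consuming. Hadoop's MapReduce framework is well suited to data-intensive problems, so in this thesis we use MapReduce to propose a pyramid-based partitioning method for processing skyline queries.
    With pyramid-based partitioning, the data load can be balanced across partitions, so that a skyline query completes faster under MapReduce's parallel processing architecture.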


    In recent years, skyline queries have received extensive attention. They have many applications in multi-criteria decision problems, such as recommender systems and confident search. The main approach to speeding up skyline query processing is to filter out most of the data during processing. Some researchers have proposed a grid-based partitioning method, but it filters out less data, which reduces its performance. Other researchers have proposed an angle-based partitioning method, which filters out more data. However, it cannot produce many partitions and requires much more computation to determine the location of a data point, which degrades its performance.
    The aim of this thesis is to propose pyramid-based processing of skyline queries in MapReduce in order to speed up skyline queries. The proposed pyramid-based partitioning method can easily partition the hypercube into many small pyramids and quickly identify the location of any data point. The local skyline points can be derived swiftly by using a two-level BNL method. Furthermore, the dominated points can be found quickly by employing a vertical-dominated method, so that the global skyline points are derived speedily. The proposed pyramid-based algorithm can therefore be processed in a highly parallel fashion and filter out data quickly, which speeds up system performance significantly.
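
    For readers unfamiliar with the terms used in the abstract, the following is a minimal sketch of the dominance test, the basic block-nested-loop (BNL) skyline, and the local-then-global pattern that MapReduce skyline algorithms follow. It is not the thesis's algorithm: the round-robin partitioning stands in for pyramid-based partitioning, the BNL is single-level rather than two-level, and the function names (`dominates`, `bnl_skyline`, `two_phase_skyline`) and the smaller-is-better convention are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """p dominates q if p is no worse in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def bnl_skyline(points: Sequence[Point]) -> List[Point]:
    """Basic in-memory BNL: keep a window of points not dominated so far."""
    window: List[Point] = []
    for p in points:
        if any(dominates(w, p) for w in window):
            continue  # p is dominated, so it cannot be a skyline point
        # p survives: evict window points that p dominates, then add p
        window = [w for w in window if not dominates(p, w)]
        window.append(p)
    return window


def two_phase_skyline(points: Sequence[Point], num_partitions: int) -> List[Point]:
    """Local skylines per partition (map side), then a global skyline
    over the merged candidates (reduce side).
    Assumption: round-robin partitioning, used only as a placeholder."""
    partitions = [points[i::num_partitions] for i in range(num_partitions)]
    local_skylines = [bnl_skyline(part) for part in partitions]
    candidates = [p for sky in local_skylines for p in sky]
    return bnl_skyline(candidates)


if __name__ == "__main__":
    data = [(1, 9), (2, 2), (3, 1), (4, 4), (9, 1), (2, 8), (1, 10)]
    print(sorted(two_phase_skyline(data, num_partitions=3)))
```

    The two-phase result equals the single-pass skyline because any global skyline point is also a skyline point of its own partition, so no true result is lost in the local phase. How much data the local phase discards depends on the partitioning scheme, which is the aspect the abstract says the pyramid-based method targets.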

    Table of Contents
    Abstract
    Chapter 1 Introduction
        1.1 Research Background
        1.2 Research Objectives
        1.3 Applications
    Chapter 2 Related Work
        2.1 Introduction to Hadoop
        2.2 MapReduce
        2.3 Yarn
        2.4 Hadoop MapReduce in Detail
        2.5 Literature Review
            2.5.1 SKY-MR Algorithm
            2.5.2 MR-GPMRS Algorithm
            2.5.3 PPF-PGPS Algorithm
            2.5.4 SKY-MR+ Algorithm
    Chapter 3 Pyramid-Based Partitioning Algorithm
        3.1 Algorithm Design
            3.1.1 Pyramid-Based Partitioning
            3.1.2 Region Filtering
            3.1.3 Two-Level BNL
            3.1.4 Virtual Minimum Point
            3.1.5 Strong Filtering
            3.1.7 Vertical-Dominated Method
        3.2 Algorithm Flow
            3.2.1 Local skyline processing
            3.2.2 Preprocessing
            3.2.3 Global skyline processing
            3.2.4 Post processing
    Chapter 4 Experiments and Analysis
        4.1 Experimental Environment
        4.2 Test Data
        4.3 Experimental Parameters
        4.4 Experimental Results
    Chapter 5 Conclusion and Future Work

