簡易檢索 / 詳目顯示

研究生: 董紜碩
Yun-Shuo Dung
論文名稱: 平行Top-k天際線群組查詢演算法之研究
A Study of Parallel Processing of Top-k G-Skyline Queries
指導教授: 陳維美
Wei-Mei Chen
口試委員: 林淵翔
Yuan-Hsiang Lin
林昌鴻
Chang Hong Lin
沈中安
Chung-An Shen
學位類別: 碩士
Master
系所名稱: 電資學院 - 電子工程系
Department of Electronic and Computer Engineering
論文出版年: 2022
畢業學年度: 110
語文別: 中文
論文頁數: 46
中文關鍵詞: 天際線群組演算法平行前k
外文關鍵詞: G-Skyline, Group-based skyline
相關次數: 點閱:134下載:0
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 天際線演算法是一種應用於資料庫中比較單筆資料優劣的演算法,在實務上已經被廣泛應用。若我們需組合多種因素綜合評估,傳統的天際線演算法即無法達成,因此衍生了以群組考量支配關係的天際線群組運算。在大資料量中挑選天際線答案時,會發生天際線數量非常多的狀況,而這種狀況在天際線群組中更加嚴重,因此產生了top-$k$問題,在天際線問題中使用天際線資料所支配的資料數量作為分數,並篩選出在資料集中支配最多數量的天際線點;在天際線群組問題中,我們以群組中的各點所支配的資料進行聯集後,計算總支配資料數量作為分數。在過去的top-$k$天際線群組研究中,因為需要計算出候選集及候選集中所有支配的點,導致計算量非常高。本論文提出了建構R-tree找出候選集中的點與Minimum Bounding Rectangle(MBR)的支配關係,藉此簡化計算支配關係的次數,並且修改R-tree的資料結構使R-tree同時具備complete tree的特性使得在R-tree的traverse上可以利用GPU加速。實驗結果顯示,本論文能夠有效提高在資料量大的效率與其他文獻提出的方法相比在空間、速度上皆有顯著提升。


    Skyline query algorithm is finding all skyline points for a given dataset, and it has been widely used in practice.
    Group-based skyline (G-Skyline) query returns the optimal groups not dominated by any other group of equal size.
    Those result would be numerous when selecting a skyline in a large dataset, and even more in the G-Skyline, so a top-$k$ problem occurred.
    In top-$k$ skyline, the number of dominated points was used as a score; however, the number of the dominated points in union of the group points was used in top-$k$ G-Skyline.
    Previous research on top-$k$ G-Skyline showed that the amount of computation is significant due to the calculation of the group and all the dominated points in the group set.
    In this research, we proposed a method by building a R-tree and finding out the dominance relationship between group point and Minimum Bounding Rectangle(MBR); therefore, the generating process is simplified.
    The data structure of the R-tree was modified and given the characteristics of the complete Tree. So that the traversal can accelerate by GPU.
    The experimental results showed that this method can effectively improve performance, especially when dealing with massive datasets. The space and execution time are significantly improved when compared with the previous proposed algorithms.

    緒論 文獻探討 問題描述 研究方法 實驗結果與分析 結論

    [1] NVIDIA, “Nvidia-ampere-ga102-gpu-architeture,” 2021.
    [2] A. Guttman, “R-trees: A dynamic index structure for spatial searching,” in Proceed-
    ings of the 1984 ACM SIGMOD international conference on Management of data,
    pp. 47–57, 1984.
    [3] S. Borzsony, D. Kossmann, and K. Stocker, “The skyline operator,” in Proceedings
    17th international conference on data engineering, pp. 421–430, IEEE, 2001.
    [4] M. Amiruzzaman and S. Jamonnak, “Multi-dimensional skyline query to find best
    shopping mall for customers,” in 2020 6th Conference on Data Science and Machine
    Learning Applications, pp. 71–76, 2020.
    [5] Z.-D. Bai, H.-K. Hwang, W.-Q. Liang, and T.-H. Tsai, “Limit theorems for the num-
    ber of maxima in random samples from planar regions,” Electronic Journal of Prob-
    ability, vol. 6, pp. 1–41, 2001.
    [6] W.-T. Balke, U. Güntzer, and J. X. Zheng, “Efficient distributed skylining for web
    information systems,” in International Conference on Extending Database Technol-
    ogy, pp. 256–273, 2004.
    [7] I. Keles and K. Hose, “Skyline queries over knowledge graphs,” in International
    Semantic Web Conference, pp. 293–310, 2019.
    [8] X. Lin, X. Jianliang, and H. Haibo, “Authentication of location-based skyline
    queries,” in Proceedings of the 20th ACM international conference on Information
    and knowledge management, pp. 1583–1588, 2011.
    [9] J. Liu, L. Xiong, J. Pei, J. Luo, and H. Zhang, “Finding pareto optimal groups:
    Group-based skyline,” Proc. VLDB Endow., vol. 8, p. 2086–2097, sep 2015.
    [10] Z. Yang, X. Zhou, K. Li, G. Xiao, Y. Gao, and K. Li, “Efficient processing of top k
    group skyline queries,” Knowledge-Based Systems, vol. 182, p. 104795, 2019.
    [11] H.-S. Kung, F. Luccio, and F. P. Preparata, “On finding the maxima of a set of vec-
    tors,” Journal of the ACM (JACM), vol. 22, no. 4, pp. 469–476, 1975.
    [12] S. Chaudhuri and L. Gravano, “Evaluating top-k selection queries,” in Vldb, vol. 99,
    pp. 397–410, 1999.
    [13] D. Papadias, Y. Tao, G. Fu, and B. Seeger, “Progressive skyline computation in
    database systems,” ACM Trans. Database Syst., vol. 30, p. 41–82, mar 2005.
    [14] B. Song, A. Liu, and L. Ding, “Efficient top-k skyline computation in mapreduce,” in
    2015 12th Web Information System and Application Conference (WISA), pp. 67–70,
    2015.
    [15] X. Feng, Y. Gao, T. Jiang, L. Chen, X. Miao, and Q. Liu, “Parallel k-skyband
    computation on multicore architecture,” in Web Technologies and Applications
    (Y. Ishikawa, J. Li, W. Wang, R. Zhang, and W. Zhang, eds.), (Berlin, Heidelberg),
    pp. 827–837, Springer Berlin Heidelberg, 2013.
    [16] W.-M. Chen and G.-H. Chen, “Divide-and-conquer recurrences associated with gen-
    eralized heaps, optimal merge, and related structures,” Theoretical computer science,
    vol. 292, no. 3, pp. 667–677, 2003.
    [17] A. Guttman, “R-trees: A dynamic index structure for spatial searching,” in Proceed-
    ings of the 1984 ACM SIGMOD International Conference on Management of Data,
    SIGMOD ’84, (New York, NY, USA), p. 47–57, Association for Computing Ma-
    chinery, 1984.

    無法下載圖示 全文公開日期 2024/08/31 (校內網路)
    全文公開日期 2027/08/31 (校外網路)
    全文公開日期 本全文未授權公開 (國家圖書館:臺灣博碩士論文系統)
    QR CODE