簡易檢索 / 詳目顯示

研究生: 彭弈文
Yi-Wen Peng
論文名稱: CPU-GPU 異質運算架構之平行演算法設計與研究
Parallel Algorithm Design for CPU-GPU Heterogeneous Architectures
指導教授: 陳維美
Wei-Mei Chen
口試委員: 陳維美
Wei-Mei Chen
呂政修
Jenq-Shiou Leu
陳省隆
Hsing-Lung Chen
阮聖彰
Shanq-Jang Ruan
謝孫源
Sun-Yuan Hsieh
張貴雲
Guey-Yun Chang
學位類別: 博士
Doctor
系所名稱: 電資學院 - 電子工程系
Department of Electronic and Computer Engineering
論文出版年: 2018
畢業學年度: 106
語文別: 英文
論文頁數: 88
中文關鍵詞: 平行演算法GPGPU天際線查詢極大團列舉
外文關鍵詞: parallel algorithm, GPGPU, skyline query, maximal clique enumeration
相關次數: 點閱:324下載:0
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 使用 GPU 執行通用運算 (GPGPU) 是近來熱門的研究議題,GPU 隨著技術的進步 也可以執行過去只由 CPU 負責的運算工作,常見得應用包括生物序列比對、語音 辨識、分子模型、氣象模擬等。透過適當的分配,使 CPU 與 GPU 個別執行適合 的工作,藉此進一步提升系統整體的運算效率,這種配置也稱為 CPU-GPU 異質 運算架構,目的在於充分利用 CPU 與 GPU 的硬體特性達到更高效的平行運算。
    在大數據時代,海量的資料透過資料挖掘與機器學習的演算法可以提供有價 值的資訊,然而大數據是從各種人事物每天的互動資訊所組成,為探索大數據所 提供的資訊,我們必須能有效處理高維度資料點以及複雜的關係網路,因此 GPU 相當適合用以加速各種分析大數據之演算法。在本研究中,我們使用兩種應用來 探討如合設計利於兩種不同性質資料的 GPU 演算法。
    關於高維度資料點,我們以 k-支配天際線查詢為應用主題。k-支配天際線查 詢功能為提取資料庫中較有代表性之資料點,在多準則決策分析的應用中經常被 使用。k-支配天際線查詢的主要挑戰在於運算量會隨著資料維度上升而呈現指數 成長的趨勢,在本篇研究中,我們著重於切割與分配 k-支配天際線查詢地運算, 使其可以有效的於 GPU 上平行執行。關於複雜關係網路,我們以極大團列舉問題 為主題。極大團是複雜關係網路中的特徵,經常在社群網絡分析與生物資訊分析 中所使用,這類資料一般都非常的龐大,因此我們需要有效處分析極大複雜關係 網路的演算法,在本篇研究中,我們提出加速列舉極大團之 GPU 演算法,同時探 討不規則運算於 GPU 上之議題。


    Recently, general-purpose computing on graphics processing units (GPGPU) attracts a great deal of attention. As the technology advances, modern GPUs are used to accelerate general-purpose applications, such as DNA sequence alignment, voice recognition, traffic simulation, thermal simulation, etc. By combining the best features of CPUs and GPUs, we can achieve even further computational gains. This paradigm, known as heterogeneous computing, aims to fully utilize both CPU and GPU architectures to execute a wide range of applications efficiently.
    In the era of big data, a large amount of data is processed by data mining and machine learning algorithms to obtain valuable information. Big data is collected day to day from many different resources and services, and huge quantities of data are produced every day by and about people, things, and their interactions. To explore cutting-edge topics of big data, we have to deal with high-dimensional data and complex graphs. Hence, GPUs are used to accelerate the algorithms for data analysis. In this thesis, we use two applications to investigate the design of GPU-accelerated algorithms and examine the performance for high-dimensional data and complex graphs.
    For high-dimensional data, we use k-dominant skyline queries to examine the per- formance of GPUs. The k-dominant skyline queries retrieve the preference points from databases, which is essential for many applications involving multi-criteria analysis. The challenge for k-dominant skyline queries is that the computational cost grows rapidly as the number of dimensions grows. In this thesis, we focus on partitioning and balancing the workload of k-dominant skyline queries. For complex graphs, maximal clique enu- meration (MCE) is used to examine the performance of the GPU-accelerated algorithm. Maximal cliques are useful in many applications, such as social graphs analysis and bioin- formatics analysis. Because these applications commonly deal with the situations involved massive data, it is crucial to devise an efficient algorithm to solve MCE problem for large datasets. In this thesis, we accelerate the maximal clique enumeration using GPU and fo- cus on handling the irregular behaviors.

    RecommendationLetter................................ i ApprovalLetter.................................... ii AbstractinChinese .................................. iii AbstractinEnglish .................................. iv Acknowledgements.................................. v Contents........................................ vi ListofFigures..................................... ix ListofTables ..................................... xi 1 Introduction.................................... 1 2 Relatedwork ................................... 5 2.1 GPGPU................................... 5 2.2 CUDA.................................... 6 2.3 Skylinequeriesandk-dominantskylinequeries . . . . . . . . . . . . . . 8 2.4 Maximalcliqueenumeration ........................ 10 2.4.1 Bron-Kerboschalgorithm ..................... 11 2.4.2 Bron-Kerboschalgorithmwithapivot . . . . . . . . . . . . . . . 12 2.4.3 Bron-Kerbosch algorithm with degeneracy . . . . . . . . . . . . 14 2.4.4 Parallel variant of Bron-Kerbosch algorithm . . . . . . . . . . . . 16 3 Parallelk-dominantskylinequeriesusingGPUs . . . . . . . . . . . . . . . . . 18 3.1 Problemdescription............................. 18 3.2 Datarepresentation ............................. 21 3.2.1 Characteristicbitmap........................ 22 3.2.2 Verificationofdominancerelation................. 23 3.2.3 Weightofcharacteristicbitmap .................. 26 3.3 Proposedalgorithm............................. 27 3.3.1 Skylinequeries........................... 28 3.3.2 k-dominantskylinequeries..................... 29 3.4 Experiments................................. 33 3.4.1 Experimentsonskylinequeries .................. 33 3.4.2 Experiments on k-dominant skyline queries . . . . . . . . . . . . 36 4 Accelerating the Bron-Kerbosch algorithm usingGPUs .................................... 40 4.1 Problemdescription............................. 40 4.2 ParallelMCEalgorithm........................... 42 4.2.1 ParallelBron-Kerboschalgorithm ................. 42 4.2.2 Datastructures ........................... 43 4.2.3 GPU-basedBron-Kerboschalgorithm . . . . . . . . . . . . . . . 45 4.2.4 Maximalcliqueexamination.................... 46 4.2.5 Pivotselection ........................... 47 4.2.6 Cliqueexpansion.......................... 48 4.3 Experiments................................. 50 4.3.1 Socialgraph ............................ 54 4.3.2 Densegraph ............................ 54 4.3.3 Synthesisgraph........................... 58 4.3.4 Percentageofexecutiontimebystage . . . . . . . . . . . . . . . 58 4.3.5 Efficiency.............................. 61 5 Conclusionsandfutureresearches ........................ 65 5.1 Conclusions................................. 65 5.2 Futureresearches .............................. 67 5.2.1 High-dimensionaldataest...................... 67 5.2.2 Large-scalecomplexgrahps .................... 67 References....................................... 68 PublicationList .................................... 74 LetterofAuthority .................................. 75

    [1] “Green500.” [Online]. Available: https://www.top500.org/green500/lists/2017/11/
    [2] “Thekoblenznetworkcollection–KONECT,”http://konect.uni-koblenz.de/networks,Oct.2016.
    [3] “Top500,” 2017. [Online]. Available: https://www.top500.org/lists/2017/11/
    [4] F. N. Afrati, P. Koutris, D. Suciu, and J. D. Ullman, “Parallel skyline queries,” Theory of Computing Systems, vol. 57, no. 4, pp. 1008–1037, 2015.
    [5] E.A.Akkoyunlu,“Theenumerationofmaximalcliquesoflargegraphs,”SIAMJournalonComputing, vol. 2, no. 1, pp. 1–6, 1973.
    [6] A.Awasthi,A.Bhattacharya,S.Gupta,andU.K.Singh,“k-dominantskylinejoinqueries:extending the join paradigm to k-dominant skylines,” in 2017 IEEE 33rd International Conference on Data Engineering, 2017, pp. 99–102.
    [7] Z.-D.Bai,C.-C.Chao,H.-K.Hwang,andW.-Q.Liang,“Onthevarianceofthenumberofmaximain random vectors and its applications,” The Annals of Applied Probability, vol. 8, no. 3, pp. 886–895, 1998.
    [8] I. Bartolini, P. Ciaccia, and M. Patella, “SaLSa: computing the skyline without scanning the whole sky,” in Proceedings of the 15th ACM International Conference on Information and Knowledge Man- agement, 2006, pp. 405–414.
    [9] N.Beckmann,H.-P.Kriegel,R.Schneider,andB.Seeger,“TheR∗-tree:anefficientandrobustaccess method for points and rectangles,” in Proceedings of the 1990 ACM SIGMOD International Confer- ence on Management of Data, 1990, pp. 322–331.
    [10] N. Bell and J. Hoberock, “Thrust: a productivity-oriented library for cuda,” GPU computing gems Jade edition, vol. 2, pp. 359–371, 2011.
    [11] H.R.Bernard,P.D.Killworth,andL.Sailer,“InformantaccuracyinsocialnetworkdataIV:Acom- parison of clique-level structure in behavioral and cognitive network data,” Social Networks, vol. 2, no. 3, pp. 191–218, 1979.
    [12] N. Berry, T. Ko, T. Moy, J. Smrcka, J. Turnley, and B. Wu, “Emergent clique formation in terrorist recruitment,” in The AAAI-04 Workshop on Agent Organizations: Theory and Practice, 2004.
    [13] K.S.Bøgh,I.Assent,andM.Magnani,“EfficientGPU-basedskylinecomputation,”inProceedings of the Ninth International Workshop on Data Management on New Hardware, 2013, pp. 5:1–5:6.
    [14] K. S. Bøgh, S. Chester, and I. Assent, “SkyAlign: a portable, work-efficient skyline algorithm for multicore and gpu architectures,” The VLDB Journal, vol. 25, no. 6, pp. 817–841, 2016.
    [15] V. Boginski, S. Butenko, and P. M. Pardalos, “Statistical analysis of financial networks,” Computa- tional statistics & data analysis, vol. 48, no. 2, pp. 431–443, 2005.
    [16] S.Borzsony,D.Kossmann,andK.Stocker,“Theskylineoperator,”inProceedings17thInternational Conference on Data Engineering, 2001, pp. 421–430.
    [17] A.Branover,D.Foley,andM.Steinman,“AMDfusionAPU:Llano,”IEEEMicro,vol.32,no.2,pp. 28–37, 2012.
    [18] C. Bron and J. Kerbosch, “Algorithm 457: finding all cliques of an undirected graph,” Communica- tions of the ACM, vol. 16, no. 9, pp. 575–577, 1973.
    [19] M. Burtscher, R. Nasre, and K. Pingali, “A quantitative study of irregular programs on GPUs,” in Workload Characterization (IISWC), 2012 IEEE International Symposium on. IEEE, 2012, pp. 141– 151.
    [20] F. Cazals and C. Karande, “A note on the problem of reporting maximal cliques,” Theoretical Com- puter Science, vol. 407, no. 1-3, pp. 564–568, 2008.
    [21] C.-Y. Chan, H. V. Jagadish, K.-L. Tan, A. K. H. Tung, and Z. Zhang, “Finding k-dominant skylines in high dimensional space,” in Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data, 2006, pp. 503–514.
    [22] W.-M. Chen, H.-K. Hwang, and T.-H. Tsai, “Maxima-finding algorithms for multidimensional sam- ples: a two-phase approach,” Computational Geometry, vol. 45, no. 1–2, pp. 33–53, 2012.
    [23] W.-M. Chen, H.-K. Hwang, and T.-H. Tsai, “Efficient maxima-finding algorithms for random planar samples.” Discrete Mathematics & Theoretical Computer Science, vol. 6, no. 1, pp. 107–122, 2003.
    [24] S.Chester,D.Šidlauskas,I.Assent,andK.S.Bøgh,“Scalableparallelizationofskylinecomputation for multi-core processors,” in 2015 IEEE 31st International Conference on Data Engineering, 2015, pp. 1083–1094.
    [25] N.ChibaandT.Nishizeki,“Arboricityandsubgraphlistingalgorithms,”SIAMJournalonComputing, vol. 14, no. 1, pp. 210–223, 1985.
    [26] W.Choi,L.Liu,andB.Yu,“Multi-criteriadecisionmakingwithskylinecomputation,”in2012IEEE 13th International Conference on Information Reuse Integration, 2012, pp. 316–323.
    [27] J.Chomicki,P.Godfrey,J.Gryz,andD.Liang,“Skylinewithpresorting:theoryandoptimizations,”in Intelligent Information Processing and Web Mining. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005, pp. 595–604.
    [28] C. A. C. Coello, G. B. Lamont, D. A. Van Veldhuizen et al., Evolutionary algorithms for solving multi-objective problems. Springer, 2007, vol. 5.
    [29] G. Creamer, R. Rowe, S. Hershkop, and S. J. Stolfo, “Segmentation and automated social hierar- chy detection through email network analysis,” in Advances in web mining and web usage analysis. Springer, 2009, pp. 40–58.
    [30] S. Damaraju, V. George, S. Jahagirdar, T. Khondker, R. Milstrey, S. Sarkar, S. Siers, I. Stolero, and A. Subbiah, “A 22nm ia multi-CPU and GPU system-on-chip,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International. IEEE, 2012, pp. 56–57.
    [31] K. Deb, K. Sindhya, and J. Hakanen, “Multi-objective optimization,” in Decision Sciences: Theory and Practice. CRC Press, 2016, pp. 145–184.
    [32] L. G. Dong and X. W. Cui, “Finding k-dominant skyline for combined dataset,” in Measurement Technology and its Application III, vol. 568, 2014, pp. 1534–1538.
    [33] N. Du, B. Wu, L. Xu, B. Wang, and P. Xin, “Parallel algorithm for enumerating maximal cliques in complex network,” in Mining Complex Data. Springer, 2009, pp. 207–221.
    [34] M. Ehrgott, Multicriteria optimization. Springer Science & Business Media, 2006.
    [35] D.Eppstein,“Allmaximalindependentsetsanddynamicdominanceforsparsegraphs,”ACMTrans-
    actions on Algorithms, vol. 5, no. 4, p. 38, 2009.
    [36] D.Eppstein,M.Löffler,andD.Strash,“Listingallmaximalcliquesinlargesparsereal-worldgraphs,”
    Journal of Experimental Algorithmics, vol. 18, pp. 3–1, 2013.
    [37] P.Godfrey,R.Shipley,andJ.Gryz,“Maximalvectorcomputationinlargedatasets,”inProceedings
    of the 31st International Conference on Very Large Data Bases, 2005, pp. 229–240.
    [38] K. Hose and A. Vlachou, “A survey of skyline processing in highly distributed environments,” The
    VLDB Journal, vol. 21, no. 3, pp. 359–384, 2012.
    [39] H.-K.Hwang,T.-H.Tsai,andW.-M.Chen,“Thresholdphenomenaink-dominantskylinesofrandom
    samples,” SIAM Journal on Computing, vol. 42, no. 2, pp. 405–441, 2013.
    [40] J. Jenkins, I. Arkatkar, J. D. Owens, A. Choudhary, and N. F. Samatova, “Lessons learned from ex- ploring the backtracking paradigm on the GPU,” Euro-Par 2011 Parallel Processing, pp. 425–437, 2011.
    [41] K. L. Jensen, M. P. Styczynski, I. Rigoutsos, and G. N. Stephanopoulos, “A generic motif discovery algorithm for sequential data,” Bioinformatics, vol. 22, no. 1, pp. 21–28, 2006.
    [42] H. Johnston, “Cliques of a graph-variations on the bron-kerbosch algorithm,” International Journal of Parallel Programming, vol. 5, no. 3, pp. 209–238, 1976.
    [43] J.KimandC.Batten,“Acceleratingirregularalgorithmsongpgpususingfine-grainhardwarework- lists,” in Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture. IEEE Computer Society, 2014, pp. 75–87.
    [44] I. Koch, “Enumerating all connected maximal common subgraphs in two graphs,” Theoretical Com- puter Science, vol. 250, no. 1-2, pp. 1–30, 2001.
    [45] M. Kontaki, A. N. Papadopoulos, and Y. Manolopoulos, “Continuous k-dominant skyline compu- tation on multidimensional data streams,” in Proceedings of the 2008 ACM Symposium on Applied Computing, 2008, pp. 956–960.
    [46] D. Kossmann, F. Ramsak, and S. Rost, “Shooting stars in the sky: an online algorithm for skyline queries,” in Proceedings of the 28th International Conference on Very Large Data Bases, 2002, pp. 275–286.
    [47] H.T.Kung,F.Luccio,andF.P.Preparata,“Onfindingthemaximaofasetofvectors,”Journalofthe ACM, vol. 22, no. 4, pp. 469–476, 1975.
    [48] E. L. Lawler, J. K. Lenstra, and A. Rinnooy Kan, “Generating all maximal independent sets: Np- hardness and polynomial-time algorithms,” SIAM Journal on Computing, vol. 9, no. 3, pp. 558–565, 1980.
    [49] J.LeeandS.-w.Hwang,“BSkyTree:scalableskylinecomputationusingabalancedpivotselection,” in Proceedings of the 13th International Conference on Extending Database Technology, 2010, pp. 195–206.
    [50] J. Lee and S.-w. Hwang, “Skytree: scalable skyline computation for sensor data,” in Proceedings of the Third International Workshop on Knowledge Discovery from Sensor Data, 2009, pp. 114–123.
    [51] J.LeeandS.wonHwang,“Scalableskylinecomputationusingabalancedpivotselectiontechnique,” Information Systems, vol. 39, pp. 1–21, 2014.
    [52] K. C. Lee, W.-C. Lee, B. Zheng, H. Li, and Y. Tian, “Z-SKY: an efficient skyline query processing framework based on Z-order,” The VLDB Journal, vol. 19, no. 3, pp. 333–362, 2010.
    [53] J. Leskovec and A. Krevl, “SNAP Datasets: Stanford large network dataset collection,” http://snap. stanford.edu/data, Jun. 2014.
    [54] E. Loukakis, “A new backtracking algorithm for generating the family of maximal independent sets of a graph,” Computers & Mathematics with Applications, vol. 9, no. 4, pp. 583–589, 1983.
    [55] L.Lu,Y.Gu,andR.Grossman,“dmaximalcliques:Adistributedalgorithmforenumeratingallmax- imal cliques and maximal clique distribution,” in IEEE International Conference on Data Mining Workshops. IEEE, 2010, pp. 1320–1327.
    [56] Z.Ma,K.Zhang,S.Wang,andC.Yu,“Adouble-index-basedk-dominantskylinealgorithmforincom- plete data stream,” in 2013 IEEE 4th International Conference on Software Engineering and Service Science, 2013, pp. 750–753.
    [57] K.MakinoandT.Uno,“Newalgorithmsforenumeratingallmaximalcliques,”inScandinavianWork- shop on Algorithm Theory. Springer, 2004, pp. 260–272.
    [58] X.Miao,Y.Gao,G.Chen,andT.Zhang,“k-dominantskylinequeriesonincompletedata,”Informa- tion Sciences, vol. 367, pp. 990–1011, 2016.
    [59] S.MittalandJ.S.Vetter,“AsurveyofCPU-GPUheterogeneouscomputingtechniques,”ACMCom- puting Surveys (CSUR), vol. 47, no. 4, p. 69, 2015.
    [60] J. W. Moon and L. Moser, “On cliques in graphs,” Israel journal of Mathematics, vol. 3, no. 1, pp. 23–28, 1965.
    [61] M. Morse, J. M. Patel, and H. V. Jagadish, “Efficient skyline computation over low-cardinality do- mains,” in Proceedings of the 33rd International Conference on Very Large Data Bases, 2007, pp. 267–278.
    [62] K. Mullesgaard, J. L. Pedersen, H. Lu, and Y. Zhou, “Efficient skyline computation in MapReduce,” in 17th International Conference on Extending Database Technology, 2014, pp. 37–48.
    [63] K.A.Naudé,“Refinedpivotselectionformaximalcliqueenumerationingraphs,”TheoreticalCom- puter Science, vol. 613, pp. 28–37, 2016.
    [64] D.Papadias,Y.Tao,G.Fu,andB.Seeger,“Anoptimalandprogressivealgorithmforskylinequeries,” in Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, 2003, pp. 467–478.
    [65] Y. Park, J.-K. Min, and K. Shim, “Parallel computation of skyline and reverse skyline queries using MapReduce,” The Proceedings of the VLDB Endowment, vol. 6, no. 14, pp. 2002–2013, 2013.
    [66] M.C.Schmidt,N.F.Samatova,K.Thomas,andB.-H.Park,“Ascalable,parallelalgorithmformaxi- mal clique enumeration,” Journal of Parallel and Distributed Computing, vol. 69, no. 4, pp. 417–428, 2009.
    [67] M. A. Siddique, T. Hao, and Y. Morimoto, “k-dominant skyline query computation in MapReduce environment,” IEICE Transactions on Information and Systems, vol. 98, no. 5, pp. 1027–1034, 2015.
    [68] V.Stix,“Findingallmaximalcliquesindynamicgraphs,”ComputationalOptimizationandapplica- tions, vol. 27, no. 2, pp. 173–186, 2004.
    [69] K.-L. Tan, P.-K. Eng, and B. C. Ooi, “Efficient progressive skyline computation,” in Proceedings of the 27th International Conference on Very Large Data Bases, 2001, pp. 301–310.
    [70] E.Tomita,A.Tanaka,andH.Takahashi,“Theworst-casetimecomplexityforgeneratingallmaximal cliques and computational experiments,” Theoretical Computer Science, vol. 363, no. 1, pp. 28–42, 2006.
    [71] S.Tsukiyama,M.Ide,H.Ariyoshi,andI.Shirakawa,“Anewalgorithmforgeneratingallthemaximal independent sets,” SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.
    [72] J. Wang, N. Rubin, A. Sidelnik, and S. Yalamanchili, “Dynamic thread block launch: A lightweight execution mechanism to support irregular applications on gpus,” ACM SIGARCH Computer Architec- ture News, vol. 43, no. 3, pp. 528–540, 2016.
    [73] S. Wasserman and K. Faust, Social network analysis: Methods and applications. Cambridge univer- sity press, 1994, vol. 8.
    [74] B.Wu,S.Yang,H.Zhao,andB.Wang,“Adistributedalgorithmtoenumerateallmaximalcliquesin mapreduce,” in The Fourth International Conference on Frontier of Computer Science and Technol-
    ogy. IEEE, 2009, pp. 45–51.
    [75] Y.Xu,J.Cheng,andA.W.-C.Fu,“Distributedmaximalcliquecomputationandmanagement,”IEEE Transactions on Services Computing, vol. 9, no. 1, pp. 110–122, 2016.
    [76] B. Yu, W. Choi, and L. Liu, “Exploring correlation for fast skyline computation,” The Journal of Supercomputing, published online.
    [77] B.Zhang,B.-H.Park,T.Karpinets,andN.F.Samatova,“Frompull-downdatatoproteininteraction networks and complexes with biological relevance,” Bioinformatics, vol. 24, no. 7, pp. 979–986, 2008.
    [78] J. Zhang, X. Jiang, W.-S. Ku, and X. Qin, “Efficient parallel skyline evaluation using MapReduce,” IEEE Transactions on Parallel and Distributed Systems, vol. 27, no. 7, pp. 1996–2009, 2016.
    [79] S. Zhang, N. Mamoulis, and D. W. Cheung, “Scalable skyline computation using object-based space partitioning,” in Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, 2009, pp. 483–494.

    無法下載圖示 全文公開日期 2023/01/30 (校內網路)
    全文公開日期 本全文未授權公開 (校外網路)
    全文公開日期 本全文未授權公開 (國家圖書館:臺灣博碩士論文系統)
    QR CODE