研究生: |
吳宗翰 Tsung-Han Wu |
---|---|
論文名稱: |
列舉極大完全子圖問題於GPU運算架構之研究 A Study on the Problem of Maximal Clique Enumeration on GPU |
指導教授: |
陳維美
Wei-Mei Chen |
口試委員: |
林昌鴻
Chang-Hong Lin 林敬舜 Ching-Shun Lin 陳維美 Wei-Mei Chen |
學位類別: |
碩士 Master |
系所名稱: |
電資學院 - 電子工程系 Department of Electronic and Computer Engineering |
論文出版年: | 2017 |
畢業學年度: | 105 |
語文別: | 中文 |
論文頁數: | 54 |
中文關鍵詞: | 列舉極大完全子圖 、平行運算 、圖形處理器 |
外文關鍵詞: | maximal clique enumeration, GPU, GPGPU, parallel |
相關次數: | 點閱:246 下載:0 |
分享至: |
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報 |
列舉極大完全子圖是一種圖論的問題,可以在大圖內找出所有完整子圖,適合用來分析網路結構、圖論分群、社交群的偵測,而目前最常用來解決此問題的演算法是Bron-Kerbosch演算法,但是該演算法大部分研究均著重運行在CPU上單執行緒的改良,對於平行版本的實現改良相對來說少之又少。近年來,由於GPGPU平行運算的發展盛行,對於需求大量計算的演算法加速運算效果顯著,但是在GPU的運作方式與CPU上的相差甚遠。本論文在這邊提出一個方法,我們稱為GPU-based MCE演算法,使用多重旗標標記法,將Bron-Kerbosch演算法的3組集合整合成一組資料集合來表示,並將Bron-Kerbosch演算法的集合計算部分以一個執行緒對應一個點的方式進行,同時為了達成GPU每個執行緒的計算工作量平衡我們使用二元搜尋法輔助集合計算。除此之外我們還善用CUDA的Pascal最新強化atomic硬體計算架構功能使用atomicAdd來統計pivoit分數,讓此演算法在GPU上的平行運算盡可能達到使用GPU的最佳效果。
Maximal Clique Enumeration is a graph theory problem, which can find all complete subgraphs in a graph, is suitable for graph analysis such as network structure, graph clustering and community group detection, etc. Bron-Kerbosch algorithm is the most popular algorithm to solve this problem, but most of the studies about the Bron-Kerbosch algorithm were in CPU single core execution implementation. It is only a few study on the parallel execution implementation version. Moreover, the general-purpose GPU (GPGPU) computation development is prevalent, which is more efficient about massive computation. In this paper, we proposed a method called GPU-based MCE, which used multiple flags to combine Bron-Kerbosch algorithm three data sets into one data set. Set-manipulation operations in Bron-Kerbosch algorithm were executed based on node-thread one-to-one mappings. We used binary search for the set-manipulation operations to balance the workload in GPU. In addition, we used enhanced atomic operations in the latest NVIDIA CUDA architecture, Pascal, to sum up the pivot score, making the proposed algorithm can get efficient results when executing in parallel using the GPU.
[1] “The koblenz network collection – KONECT,” http://konect.uni-koblenz.de/networks, Oct. 2016.
[2] F. N. Abu-Khzam, N. E. Baldwin, M. A. Langston, and N. F. Samatova, “On the relative efficiency of maximal clique enumeration algorithms, with applications to highthroughput computational biology,” in International Conference on Research Trends in Science and Technology, 2005.
[3] E. A. Akkoyunlu, “The enumeration of maximal cliques of large graphs,” SIAM Journal on Computing, vol. 2, no. 1, pp. 1–6, 1973.
[4] H. R. Bernard, P. D. Killworth, and L. Sailer, “Informant accuracy in social network data iv: A comparison of clique-level structure in behavioral and cognitive network data,” Social Networks, vol. 2, no. 3, pp. 191–218, 1979.
[5] N. Berry, T. Ko, T. Moy, J. Smrcka, J. Turnley, and B. Wu, “Emergent clique formation in terrorist recruitment,” in The AAAI-04 Workshop on Agent Organizations: Theory and Practice, 2004.
[6] V. Boginski, S. Butenko, and P. M. Pardalos, “Statistical analysis of financial networks,” Computational statistics & data analysis, vol. 48, no. 2, pp. 431–443, 2005.
[7] C. Bron and J. Kerbosch, “Algorithm 457: finding all cliques of an undirected graph,” Communications of the ACM, vol. 16, no. 9, pp. 575–577, 1973.
[8] F. Cazals and C. Karande, “A note on the problem of reporting maximal cliques,” Theoretical Computer Science, vol. 407, no. 1-3, pp. 564–568, 2008.
[9] Y. Chen and G. M. Crippen, “A novel approach to structural alignment using realistic structural and environmental information,” Protein Science, vol. 14, no. 12, pp. 2935– 2946, 2005.
[10] J. Cheng, L. Zhu, Y. Ke, and S. Chu, “Fast algorithms for maximal clique enumeration with limited memory,” in Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012, pp. 1240–1248.
[11] N. Chiba and T. Nishizeki, “Arboricity and subgraph listing algorithms,” SIAM Journal on Computing, vol. 14, no. 1, pp. 210–223, 1985.
[12] G. Creamer, R. Rowe, S. Hershkop, and S. J. Stolfo, “Segmentation and automated social hierarchy detection through email network analysis,” in Advances in web mining and web usage analysis. Springer, 2009, pp. 40–58.
[13] N. Du, B. Wu, L. Xu, B. Wang, and P. Xin, “Parallel algorithm for enumerating maximal cliques in complex network,” in Mining Complex Data. Springer, 2009, pp. 207–221.
[14] D. Eppstein, M. Löffler, and D. Strash, “Listing all maximal cliques in sparse graphs in near-optimal time,” in International Symposium on Algorithms and Computation. Springer, 2010, pp. 403–414.
[15] J. Jenkins, I. Arkatkar, J. D. Owens, A. Choudhary, and N. F. Samatova, “Lessons learned from exploring the backtracking paradigm on the gpu,” Euro-Par 2011 Parallel Processing, pp. 425–437, 2011.
[16] K. L. Jensen, M. P. Styczynski, I. Rigoutsos, and G. N. Stephanopoulos, “A generic motif discovery algorithm for sequential data,” Bioinformatics, vol. 22, no. 1, pp. 21–28, 2006.
[17] I. Koch, “Enumerating all connected maximal common subgraphs in two graphs,” Theoretical Computer Science, vol. 250, no. 1-2, pp. 1–30, 2001.
[18] E. L. Lawler, J. K. Lenstra, and A. Rinnooy Kan, “Generating all maximal independent sets: Np-hardness and polynomial-time algorithms,” SIAM Journal on Computing, vol. 9, no. 3, pp. 558–565, 1980.
[19] J. Leskovec and A. Krevl, “SNAP Datasets: Stanford large network dataset collection,” http://snap.stanford.edu/data, Jun. 2014.
[20] L. Lu, Y. Gu, and R. Grossman, “dmaximalcliques: A distributed algorithm for enumerating all maximal cliques and maximal clique distribution,” in Data Mining Workshops (ICDMW), 2010 IEEE International Conference on. IEEE, 2010, pp. 1320–1327.
[21] K. Makino and T. Uno, “New algorithms for enumerating all maximal cliques,” in Scandinavian Workshop on Algorithm Theory. Springer, 2004, pp. 260–272.
[22] J. W. Moon and L. Moser, “On cliques in graphs,” Israel journal of Mathematics, vol. 3, no. 1, pp. 23–28, 1965.
[23] NVIDIA, “CUDA C programming guide v7.5,” https://images.nvidia.com/content/pdf/tesla/whitepaper/pascal-architecture-whitepaper.pdf, 2017.
[24] NVIDIA, “GP100 pascal whitepaper - nvidia,” http://docs.nvidia.com/pdf/CUDA_C_Programming_Guide.pdf, 2016.
[25] M. C. Schmidt, N. F. Samatova, K. Thomas, and B.-H. Park, “A scalable, parallel algorithm for maximal clique enumeration,” Journal of Parallel and Distributed Computing, vol. 69, no. 4, pp. 417–428, 2009.
[26] V. Stix, “Finding all maximal cliques in dynamic graphs,” Computational Optimization and applications, vol. 27, no. 2, pp. 173–186, 2004.
[27] E. Tomita, A. Tanaka, and H. Takahashi, “The worst-case time complexity for generating all maximal cliques and computational experiments,” Theoretical Computer Science, vol. 363, no. 1, pp. 28–42, 2006.
[28] S. Tsukiyama, M. Ide, H. Ariyoshi, and I. Shirakawa, “A new algorithm for generating all the maximal independent sets,” SIAM Journal on Computing, vol. 6, no. 3, pp. 505–517, 1977.
[29] J. Wang, J. Cheng, and A. W.-C. Fu, “Redundancy-aware maximal cliques,” in Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2013, pp. 122–130.
[30] S. Wasserman and K. Faust, Social network analysis: Methods and applications. Cambridge university press, 1994, vol. 8.
[31] B. Wu, S. Yang, H. Zhao, and B. Wang, “A distributed algorithm to enumerate all maximal cliques in mapreduce,” in Frontier of Computer Science and Technology, 2009. FCST’09. Fourth International Conference on. IEEE, 2009, pp. 45–51.
[32] Y. Xu, J. Cheng, and A. W.-C. Fu, “Distributed maximal clique computation and management,” IEEE Transactions on Services Computing, vol. 9, no. 1, pp. 110–122, 2016.
[33] B. Zhang, B.-H. Park, T. Karpinets, and N. F. Samatova, “From pull-down data to protein interaction networks and complexes with biological relevance,” Bioinformatics,vol. 24, no. 7, pp. 979–986, 2008.