簡易檢索 / 詳目顯示

研究生: Apicha Lumveerakul
Apicha Lumveerakul
論文名稱: 網路分析中嵌入、聚類和社區檢測技術的比較研究
A Comparative Study of Embedding, Clustering, and Community Detection Techniques in Network Analysis.
指導教授: 楊朝龍
Chao-Lung Yang
口試委員: 林希偉
Shi-Woei Lin
許嘉裕
Chia-Yu Hsu
學位類別: 碩士
Master
系所名稱: 管理學院 - 工業管理系
Department of Industrial Management
論文出版年: 2023
畢業學年度: 112
語文別: 英文
論文頁數: 75
外文關鍵詞: community detection, network clustering, network reduction, large-scale network
相關次數: 點閱:54下載:4
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報

  • In the realm of social network analysis, managing and interpreting large-scale networks pose significant challenges due to their vast size and complexity. This study introduces a comprehensive framework that synergizes advanced network analysis techniques, particularly focusing on Node2Vec for network embedding and HDBSCAN for community detection, alongside traditional network reduction methods and other traditional community detection. This study use the KS-statistic to assess network similarity and structural integrity post-analysis. To ensure that the findings are not biased by specific data sizes or characteristics, the study incorporates 10 datasets of undirected networks, each differing in scale and structural features. Performance indicators for these datasets are represented graphically as curves, and the area under each curve is calculated to provide a comprehensive evaluation metric. Each combination of dataset and algorithm undergoes 50 iterations to minimize the influence of random variation. The findings reveal that the methods of network clustering and network reduction function as independent factors and methods incorporating the induced subgraph random walk generally surpass the performance of the random walk method, regardless of the combination used. Moreover, these methods demonstrate the advantage of saving computational time while trying to maintaining or enhancing performance.

    TABLE OF CONTENTS iv LIST OF FIGURES vi LIST OF TABLES ix CHAPTER 1. INTRODUCTION 1 1.1. Social Network Data and Background 1 1.2. Research Problem-How to Efficiently Reduce the Network 2 1.3. Propose-Combination of SNA Methods 4 1.4. Structure 5 CHAPTER 2. LITERATURE REVIEW 7 2.1. Network Reduction 7 2.2. Community Detection 9 2.3. Measurement of Network 12 CHAPTER 3. METHODOLOGY 14 3.1. Network Measurement. 14 3.1.1. Degree Distribution 14 3.1.2. Clustering Coefficient Distribution 15 3.1.3. K-core Distribution 16 3.1.4. KS-Distance 17 3.2. Community Detection Algorithms 18 3.2.1. Modularity 18 3.2.2. Leiden Algorithmh 19 3.2.3. HDBSCAN 20 3.3. Node2Vec Algorithm 21 3.4. Network Reduction Algorithms 22 3.4.1. Random Walk (RW) 23 3.4.2. Induced Subgraph Random Walk (ISRW) 24 3.5. Overall Framework 25 CHAPTER 4. EXPERIMENT 27 4.1. Data Description 27 4.2. Testing Node2vec and HDBSCAN Reduction Algorithms Exp #1 28 4.3. Comparison Between Different Clustering Algorithms Exp #2 32 4.4. Explore variety of dataset Exp #3 37 4.5. Discussion 43 CHAPTER 5. CONCLUSION 46 REFERENCES 48 APPENDIX 53

    REFERENCES
    [1] S. P. Borgatti, A. Mehra, D. J. Brass, and G. Labianca, "Network analysis in the social sciences," Science, vol. 323, no. 5916, pp. 892-895, 13 Feb 2009.
    [2] O. Serrat, "Social network analysis," in Knowledge solutions: Springer, 2017, pp. 39-43.
    [3] S. Wasserman and K. Faust, "Social network analysis: Methods and applications," 1994.
    [4] D. Torgerson, "Industrialization and assessment: social impact assessment as a social phenomenon," 1980.
    [5] S. Wasserman and K. Faust, Social network analysis: Methods and applications. June 2012: Cambridge University Press, 1994.
    [6] D. Watts, "Networks, dynamics, and the small-world phenomenon," American Journal of sociology, vol. 105, no. 2, pp. 493-527, September 1999.
    [7] D. R. White, J. Owen-Smith, J. Moody, and W. W. Powell, "Networks, fields and organizations: micro-dynamics, scale and cohesive embeddings," Computational mathematical organization theory, vol. 10, no. 1, pp. 95-117, 2004.
    [8] E. L. Kick, L. A. McKinney, S. McDonald, and A. Jorgenson, "A multiple-network analysis of the world system of nations, 1995-1999," Sage handbook of social network analysis, pp. 311-327, 2011.
    [9] A. Quan-Haase and B. Wellman, Computer-mediated community in a high-tech organization (The firm as a collaborative community: reconstructing trust in the knowledge economy). 2006, pp. 281-333.
    [10] J. A. Barnes and F. Harary, "Graph theory in network analysis," Social networks, vol. 5, no. 2, pp. 235-244, June 1983.
    [11] J. L. Moreno, Who shall survive? Foundations of sociometry, group psychotherapy and socio-drama. 1953.
    [12] F. Harary and R. Z. Norman, Graph theory as a mathematical model in social science (no. 2). University of Michigan, Institute for Social Research Ann Arbor, 1953.
    [13] F. Zhou, S. Malher, and H. Toivonen, "Network simplification with minimal loss of connectivity," in 2010 IEEE international conference on data mining, 20 January 2010: IEEE, pp. 659-668.
    [14] H. Oh, "Aggregation of buses for a network reduction," IEEE Transactions on Power Systems, vol. 27, no. 2, pp. 705-712, 09 January 2012.
    [15] D. Zhang, J. Yin, X. Zhu, and C. Zhang, "Network representation learning: A survey," IEEE transactions on Big Data, vol. 6, no. 1, pp. 3-28, 1 March 2018.
    [16] S. M. Ashraf, B. Rathore, and S. Chakrabarti, "Performance analysis of static network reduction methods commonly used in power systems," in 2014 Eighteenth National Power Systems Conference (NPSC), 2014: IEEE, pp. 1-6.
    [17] L. Bo-Seng, "Metaheuristic-based Optimization for Data Sampling of Social Network Clusters," Master, Department of Industrial Engineering and Management, National Taiwan University of Sciece and Technology, 2021.
    [18] A. Grover and J. Leskovec, "node2vec: Scalable feature learning for networks," in Proceedings of the 22nd ACM SIGKDD international conference on Knowledge discovery and data mining, 2016, pp. 855-864.
    [19] L. McInnes, J. Healy, and S. Astels, "hdbscan: Hierarchical density based clustering," J. Open Source Softw., vol. 2, no. 11, p. 205, 2017.
    [20] N. Martin, P. Frasca, and C. Canudas-de-Wit, "Large-scale network reduction towards scale-free structure," IEEE Transactions on Network Science Engineering, vol. 6, no. 4, pp. 711-723, 26 September 2018.
    [21] V. Dubois and C. Bothorel, "Transitive reduction for social network analysis and visualization," in The 2005 IEEE/WIC/ACM International Conference on Web Intelligence (WI'05), 17 October 2005: IEEE, pp. 128-131.
    [22] M. Kudelka, Z. Horak, V. Snasel, and A. Abraham, "Social network reduction based on stability," in 2010 International Conference on Computational Aspects of Social Networks, 2010: IEEE, pp. 509-514.
    [23] J. Leskovec and C. Faloutsos, "Sampling from large graphs," in Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, 20 August 2006, pp. 631-636.
    [24] S. Tsugawa and H. Ohsaki, "Benefits of bias in crawl-based network sampling for identifying key node set," IEEE Access, vol. 8, pp. 75370-75380, 20 April 2020.
    [25] J. Kleinberg, "The small-world phenomenon: An algorithmic perspective," in Proceedings of the thirty-second annual ACM symposium on Theory of computing, 2000, pp. 163-170.
    [26] F. Gehring and P. Halmos, "Graduate Texts in Mathematics," 1977.
    [27] M. Plantié and M. Crampes, "Survey on social community detection," in Social media retrieval: Springer, 2013, pp. 65-85.
    [28] A.-L. Barabási and R. Albert, "Emergence of scaling in random networks," Science, vol. 286, no. 5439, pp. 509-512, 15 October 1999.
    [29] M. Faloutsos, P. Faloutsos, and C. Faloutsos, "On power-law relationships of the internet topology," ACM SIGCOMM computer communication review, vol. 29, no. 4, pp. 251-262, 30 August 1999.
    [30] S. Milgram, "The small world problem," Psychology today, vol. 2, no. 1, pp. 60-67, 1 May 1967.
    [31] R. Albert, H. Jeong, and A.-L. Barabási, "Diameter of the world-wide web," Nature, vol. 401, no. 6749, pp. 130-131, 09 September 1999.
    [32] J. Leskovec and E. Horvitz, "Planetary-scale views on a large instant-messaging network," in Proceedings of the 17th international conference on World Wide Web, 21 April 2008, pp. 915-924.
    [33] F. D. Malliaros and M. Vazirgiannis, "Clustering and community detection in directed networks: A survey," Physics reports, vol. 533, no. 4, pp. 95-142, 30 December 2013.
    [34] S. Fortunato, "Community detection in graphs," Physics reports, vol. 486, no. 3-5, pp. 75-174, 17 November 2010.
    [35] M. Coscia, F. Giannotti, and D. Pedreschi, "A classification for community discovery methods in complex networks," Statistical Analysis and Data Mining: The ASA Data Science Journal, vol. 4, no. 5, pp. 512-546, 09 September 2011.
    [36] M. Crampes and M. Plantié, "A unified community detection, visualization and analysis method," Advances in complex systems, vol. 17, no. 01, p. 1450001, 12 March 2014.
    [37] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, "Fast unfolding of communities in large networks," Journal of statistical mechanics: theory experiment, vol. 2008, no. 10, p. P10008, 9 October 2008.
    [38] P. De Meo, E. Ferrara, G. Fiumara, and A. Provetti, "Generalized louvain method for community detection in large networks," in 2011 11th international conference on intelligent systems design and applications, 03 January 2011: IEEE, pp. 88-93.
    [39] V. A. Traag, L. Waltman, and N. J. Van Eck, "From Louvain to Leiden: guaranteeing well-connected communities," Scientific reports, vol. 9, no. 1, pp. 1-12, 26 March 2019.
    [40] A. Boehnlein et al., "Colloquium: Machine learning in nuclear physics," Reviews of Modern Physics, vol. 94, no. 3, p. 031003, 2022.
    [41] M. M. Deza and E. Deza, "Voronoi diagram distances," in Encyclopedia of Distances: Springer, 2013, pp. 339-347.
    [42] Y. Bai, H. Ding, S. Bian, T. Chen, Y. Sun, and W. Wang, "Simgnn: A neural network approach to fast graph similarity computation," in Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, 30 January 2019, pp. 384-392.
    [43] R. Albert and A.-L. Barabási, "Statistical mechanics of complex networks," Reviews of modern physics, vol. 74, no. 1, p. 47, 30 January 2002.
    [44] P. N. Krivitsky, M. S. Handcock, A. E. Raftery, and P. D. Hoff, "Representing degree distributions, clustering, and homophily in social networks with latent cluster random effects models," Social networks, vol. 31, no. 3, pp. 204-213, 26 May 2009.
    [45] E. Estrada, "Degree heterogeneity of graphs and networks. I. Interpretation and the “heterogeneity paradox”," Journal of Interdisciplinary Mathematics, vol. 22, no. 4, pp. 503-529, 2019.
    [46] Z. Burda, J. Jurkiewicz, and A. Krzywicki, "Network transitivity and matrix models," Physical Review E, vol. 69, no. 2, p. 026106, 2004.
    [47] S. N. Soffer and A. Vazquez, "Network clustering coefficient without degree-correlation biases," Physical Review E, vol. 71, no. 5, p. 057101, 2005.
    [48] Y.-X. Kong, G.-Y. Shi, R.-J. Wu, and Y.-C. Zhang, "k-core: Theories and applications," Physics Reports, vol. 832, pp. 1-32, 2019.
    [49] M. L. Goldstein, S. A. Morris, and G. G. Yen, "Problems with fitting to the power-law distribution," The European Physical Journal B-Condensed Matter
    Complex Systems, vol. 41, no. 2, pp. 255-258, 18 June 2004.
    [50] M. Ghavipour and M. R. Meybodi, "Irregular cellular learning automata-based algorithm for sampling social networks," Engineering Applications of Artificial Intelligence, vol. 59, pp. 244-259, 14 January 2017.
    [51] D. J. Watts and S. H. Strogatz, "Collective dynamics of ‘small-world’networks," Nature, vol. 393, no. 6684, pp. 440-442, 04 June 1998.
    [52] M. Newman, "Networks: An introduction," Artif. Life, vol. 18, pp. 241-2, 1 April 2012.
    [53] S. Wang, L. Huang, C.-H. Hsu, and F. Yang, "Collaboration reputation for trustworthy Web service selection in social networks," Journal of Computer System Sciences, vol. 82, no. 1, pp. 130-143, 2016.
    [54] S. B. Seidman, "Network structure and minimum degree," Social networks, vol. 5, no. 3, pp. 269-287, September 1983.
    [55] Q. Yu, H. Shao, and Z. Duan, "Research groups of oncology co-authorship network in China," Scientometrics, vol. 89, no. 2, pp. 553-567, 06 August 2011.
    [56] J. Alvarez-Hamelin, L. Dall'Asta, A. Barrat, and A. Vespignani, "Large scale networks fingerprinting and visualization using the k-core decomposition," Advances in neural information processing systems, vol. 18, 05 December 2005.
    [57] K. J. Dooley, S. D. Pathak, T. J. Kull, Z. Wu, J. Johnson, and E. Rabinovich, "Process network modularity, commonality, and greenhouse gas emissions," Journal of Operations Management, vol. 65, no. 2, pp. 93-113, 18 March 2019, doi: 10.1002/joom.1007.
    [58] F. Gehring and P. Halmos, Graph Theory (Graduate Texts in Mathematics).
    [59] Y. Li, Y. Shang, and Y. Yang, "Clustering coefficients of large networks," Information Sciences, vol. 382, pp. 350-358, 2017.
    [60] P. Chunaev, "Community detection in node-attributed social networks: a survey," Computer Science Review, vol. 37, p. 100286, 21 July 2020.
    [61] J. Golbeck, J. M. Grimes, and A. Rogers, "Twitter use by the US Congress," Journal of the American society for information science and technology, vol. 61, no. 8, pp. 1612-1621, 2010.
    [62] H. Yin, A. R. Benson, J. Leskovec, and D. F. Gleich, "Local higher-order graph clustering," in Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, 04 August 2017, pp. 555-564.
    [63] S. Kumar, F. Spezzano, V. Subrahmanian, and C. Faloutsos, "Edge weight prediction in weighted signed networks," in 2016 IEEE 16th International Conference on Data Mining (ICDM), 12 December 2016: IEEE, pp. 221-230.
    [64] J. Leskovec, J. Kleinberg, and C. Faloutsos, "Graph evolution: Densification and shrinking diameters," ACM transactions on Knowledge Discovery from Data, vol. 1, no. 1, pp. 2-es, 01 March 2007.
    [65] J. Leskovec and J. Mcauley, "Learning to discover social circles in ego networks," Advances in neural information processing systems, vol. 25, 2012.
    [66] L. Waltman and N. J. Van Eck, "A smart local moving algorithm for large-scale modularity-based community detection," The European physical journal B, vol. 86, no. 11, pp. 1-14, 13 November 2013.
    [67] Y. Hsu, "Apply Community Detection and Network Reduction for Social Network Simplification,," Master degree, National Taiwan University of science and technology, 2022.

    QR CODE