簡易檢索 / 詳目顯示

研究生: 賴伯昇
Bo-Sheng Lai
論文名稱: 基於啟發式演算法之社群網路分群抽樣最佳化問題
Metaheuristic-based Optimization for Data Sampling of Social Network Clusters
指導教授: 楊朝龍
Chao-Lung Yang
口試委員: 陳怡伶
Yi-Ling Chen
林希偉
Shi-Woei Lin
學位類別: 碩士
Master
系所名稱: 管理學院 - 工業管理系
Department of Industrial Management
論文出版年: 2021
畢業學年度: 109
語文別: 英文
論文頁數: 63
中文關鍵詞: 社群網路分析社群網路抽樣圖像指標最佳化基因演算法
外文關鍵詞: Social Network Analysis, Social Network Sampling, Graph Measurement, Genetic Algorithm Optimization
相關次數: 點閱:183下載:0
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報

如何視覺化大量社群網路資料並萃取出其中有用的資訊一直是一件具有挑戰性的事。過往的文獻顯示出,社群網路的簡化或抽樣多應用於簡化網路的同時保留具影響力的因子。然而,若僅著重於找出網路中的意見領袖或是影響因子可能會忽略整體網路的樣貌、遺失一些潛在的可能性。本研究提出一以分群為基底之最佳化社群網路抽樣架構,在簡化社群網路資料的同時維持其原始的網路樣貌。此研究提出兩種社群網路分群數據的抽樣最佳化設計: 1)固定比例抽樣法(Fixed Ratio Sampling, FRS)在各群中選擇相同的抽樣比例進行抽樣最佳化;2)調整比例抽樣法(Adjusted Ratio Sampling, ARS)則是根據本研究所提出的正規化抽樣函數(Normalized Adjusted Ratio Sampling, NARS),依照資料的分佈分配不同的抽樣比例給各群,以進行抽樣最佳化。此研究應用最廣為使用的啟發式演算法之一,基因演算法,作為最佳化的演算法並針對提出的聚合相似度距離指標(Aggregated Similarity Distance, ASD)進行原始網路和簡化網路間的相似度距離最小化,藉此找出既少量但又能呈現資料原貌的資料點。研究結果顯示出,在相同的計算資源下,ARS演算法相較於FRS演算法的最佳化結果可以取得更接近原始網路的樣貌。此外,正規化計算時間和ASD指標的分析驗證了當抽樣比例縮減至原始網路的40%~60%的區間時,其結果最具經濟效益。


When data in a social network is massive, showing a large graph with meaningful visualization is challenging. Therefore, network reduction or sampling is commonly used to simplify the network and keep the influential components only. Many network reduction or sampling studies focused on identifying the influencers or opinion leaders in social network. However, only focusing the influencers or opinion leaders might lose the overall profiling of the whole network. In this research, a metaheuristic-based optimization framework for cluster-based social network data sampling was proposed to extract representative information from an original network without losing the network profiling. Two sampling methodology designs were developed to optimized data sampling: 1) Fixed ratio sampling (FRS) which selects a fixed proportion of data in each cluster; 2) Adjusted ratio sampling (ARS) which selects different proportion of data by the proposed normalized adjusted ratio sampling (NARS) function in each cluster. Genetic algorithm (GA), one of the most popular metaheuristic methods, was used to optimize the similarity of the original and reduced networks based on the proposed aggregated similarity distance (ASD). The experiment result showed that the proposed ARS outperformed FRS with smaller ASD. ARS can produce graph that is more similar to the original network under the same computational resource. Moreover, the analysis of normalized computational time and ASD verified that sampling rate in the range of 40% to 60% is the most economical.

摘要 i ABSTRACT ii 致謝 iii TABLE OF CONTENTS iv LIST OF FIGURES vi LIST OF TABLES viii CHAPTER 1. INTRODUCTION 1 CHAPTER 2. LITERATURE REVIEW 5 2.1. Social Network Measure 5 2.1.1. Degree Centralization 5 2.1.2. Graph Density 6 2.1.3. Graph Entropy 6 2.1.4. Average Clustering Coefficient 7 2.2. Social Network Clustering 9 2.3. Social Network Sampling 12 CHAPTER 3. METHODOLOGY 14 3.1. Hypothesis 16 3.2. Normalized Adjusted Ratio Sampling Function (NARS) 17 3.3. Genetic Algorithm Design 18 3.3.1. Selection 20 3.3.2. Crossover 21 3.3.3. Mutation 21 3.4. Proposed Social Network Indicators 22 3.5. Aggregated Similarity Distance (ASD) 24 CHAPTER 4. EXPERIMENTS AND RESULTS 25 4.1. Experiment Setting and Data Description 25 4.2. Single Representative Node Optimization - Experiment #1 28 4.2.1. Result of Experiment #1 29 4.3. Fixed Ratio Sampling Optimization – Experiment #2 30 4.3.1. Result of Experiment #2 31 4.4. Adjusted Ratio Sampling Optimization - Experiment #3 33 4.4.1. Result of Experiment #3 33 4.5. Comparison of Experiment #2 and Experiment #3 35 4.6. Discussion of Proportional ASD in each Transformed Graph 37 4.7. Discussion of Implementation 39 4.7.1. Comparison of the trend of ASD in another two days’ data 39 4.7.2. Comparison of the distribution of ASD proportion in another two days’ data 40 4.7.3. Comparison of NASD and NCT in another two days’ data 41 CHAPTER 5. CONCLUSION 44 REFERENCES 46 APPENDIX 50

[1] O. Zincir, "Knowledge Workers' Social Media Usage as a Personal Knowledge Management Tool," 2017, pp. 108-124.
[2] M. Kaple, K. Kulkarni, and K. Potika, "Viral Marketing for Smart Cities: Influencers in Social Network Communities," in 2017 IEEE Third International Conference on Big Data Computing Service and Applications (BigDataService), 2017, pp. 106-111.
[3] S. Arrami, W. Oueslati, and J. Akaichi, "Detection of Opinion Leaders in Social Networks: A Survey," in Intelligent Interactive Multimedia Systems and Services 2017, Cham, 2018, pp. 362-370: Springer International Publishing.
[4] Y. Liu, T. Safavi, A. Dighe, and D. Koutra, "Graph Summarization Methods and Applications: A Survey," vol. 51, no. 3 %J ACM Comput. Surv., p. Article 62, 2018.
[5] C. Bhaumik, A. K. Agrawal, and P. Sinha, "Using social network graphs for search space reduction in internet of things," presented at the Proceedings of the 2012 ACM Conference on Ubiquitous Computing, Pittsburgh, Pennsylvania, 2012. Available: https://doi.org/10.1145/2370216.2370325
[6] F. Giglietto, N. Righetti, L. Rossi, and G. Marino, "It takes a village to manipulate the media: coordinated link sharing behavior during 2018 and 2019 Italian elections," Information, Communication & Society, vol. 23, pp. 1-25, 03/17 2020.
[7] M. Duggan, "The Demographics of Social Media Users," 2015, Available: https://www.pewresearch.org/internet/2015/08/19/the-demographics-of-social-media-users/.
[8] K. Kempf-Leonard, Encyclopedia of Social Measurement. Elsevier, 2004.
[9] L. C. Freeman, "Centrality in social networks conceptual clarification," Social Networks, vol. 1, no. 3, pp. 215-239, 1978/01/01/ 1978.
[10] E. Otte and R. J. J. o. i. S. Rousseau, "Social network analysis: a powerful strategy, also for the information sciences," vol. 28, no. 6, pp. 441-453, 2002.
[11] C. E. Shannon, "A mathematical theory of communication," The Bell System Technical Journal, vol. 27, no. 3, pp. 379-423, 1948.
[12] Y. M. Omar and P. Plapper, "A Survey of Information Entropy Metrics for Complex Networks," Entropy, vol. 22, no. 12, 2020.
[13] N. Rashevsky, "Life, information theory, and topology," The bulletin of mathematical biophysics, vol. 17, no. 3, pp. 229-235, 1955/09/01 1955.
[14] R. Todeschini and V. Consonni, Handbook of molecular descriptors. John Wiley & Sons, 2008.
[15] A. Mowshowitz, "Entropy and the complexity of graphs: I. An index of the relative complexity of a graph," The bulletin of mathematical biophysics, vol. 30, no. 1, pp. 175-204, 1968/03/01 1968.
[16] D. Ortiz-Arroyo and D. M. A. Hussain, "An Information Theory Approach to Identify Sets of Key Players," in Intelligence and Security Informatics, Berlin, Heidelberg, 2008, pp. 15-26: Springer Berlin Heidelberg.
[17] E. Serin and S. Balcisoy, "Entropy Based Sensitivity Analysis and Visualization of Social Networks," in 2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2012, pp. 1099-1104.
[18] F. Tutzauer, "Entropy as a measure of centrality in networks characterized by path-transfer flow," Social Networks, vol. 29, no. 2, pp. 249-265, 2007/05/01/ 2007.
[19] Q. Wang, G. Zeng, and X. Tu, "Information Technology Project Portfolio Implementation Process Optimization Based on Complex Network Theory and Entropy," Entropy, vol. 19, p. 287, 06/19 2017.
[20] D. J. Watts and S. H. Strogatz, "Collective dynamics of ‘small-world’ networks," Nature, vol. 393, no. 6684, pp. 440-442, 1998/06/01 1998.
[21] J.-P. Onnela, J. Saramäki, J. Kertész, and K. Kaski, "Intensity and coherence of motifs in weighted complex networks," Physical Review E, vol. 71, no. 6, p. 065103, 06/13/ 2005.
[22] B. Khan and M. J. A. Niazi, "Network Community Detection: A Review and Visual Survey," vol. abs/1708.00977, 2017.
[23] J. Zhu, B. Wang, B. Wu, and W. J. I. T. I. S. Zhang, "Emotional Community Detection in Social Network," vol. 100-D, pp. 2515-2525, 2017.
[24] C. Li and Y. Zhang, "A personalized recommendation algorithm based on large-scale real micro-blog data," Neural Computing and Applications, vol. 32, no. 15, pp. 11245-11252, 2020/08/01 2020.
[25] D. Wang, J. Li, K. Xu, and Y. Wu, "Sentiment community detection: exploring sentiments and relationships in social networks," Electronic Commerce Research, vol. 17, no. 1, pp. 103-132, 2017/03/01 2017.
[26] P. Bedi and C. Sharma, "Community detection in social networks," WIREs Data Mining and Knowledge Discovery, https://doi.org/10.1002/widm.1178 vol. 6, no. 3, pp. 115-135, 2016/05/01 2016.
[27] B. W. Kernighan and S. Lin, "An efficient heuristic procedure for partitioning graphs," The Bell System Technical Journal, vol. 49, no. 2, pp. 291-307, 1970.
[28] J. C. Bezdek, Pattern recognition with fuzzy objective function algorithms. Springer Science & Business Media, 2013.
[29] S. Fortunato, "Community detection in graphs," Physics Reports, vol. 486, no. 3, pp. 75-174, 2010/02/01/ 2010.
[30] T. Murata, "Detecting Communities in Social Networks," in Handbook of Social Network Technologies and Applications, B. Furht, Ed. Boston, MA: Springer US, 2010, pp. 269-280.
[31] M. Girvan and M. E. J. Newman, "Community structure in social and biological networks," Proceedings of the National Academy of Sciences, vol. 99, no. 12, p. 7821, 2002.
[32] F. Madani, "‘Technology Mining’ bibliometrics analysis: applying network analysis and cluster analysis," Scientometrics, vol. 105, no. 1, pp. 323-335, 2015/10/01 2015.
[33] M. E. J. Newman and M. Girvan, "Finding and evaluating community structure in networks," Physical Review E, vol. 69, no. 2, p. 026113, 02/26/ 2004.
[34] A. Clauset, M. E. J. Newman, and C. Moore, "Finding community structure in very large networks," Physical Review E, vol. 70, no. 6, p. 066111, 12/06/ 2004.
[35] R. Guimerà and L. A. Nunes Amaral, "Functional cartography of complex metabolic networks," Nature, vol. 433, no. 7028, pp. 895-900, 2005/02/01 2005.
[36] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, "Fast unfolding of communities in large networks," Journal of Statistical Mechanics: Theory and Experiment, vol. 2008, no. 10, p. P10008, 2008/10/09 2008.
[37] A. K. Bhowmick, K. Meneni, M. Danisch, J.-L. Guillaume, and B. Mitra, "LouvainNE: Hierarchical Louvain Method for High Quality and Scalable Network Embedding," presented at the Proceedings of the 13th International Conference on Web Search and Data Mining, Houston, TX, USA, 2020. Available: https://doi.org/10.1145/3336191.3371800
[38] A. Adam, J.-C. Delvenne, and I. Thomas, "Detecting communities with the multi-scale Louvain method: robustness test on the metropolitan area of Brussels," Journal of Geographical Systems, vol. 20, no. 4, pp. 363-386, 2018/10/01 2018.
[39] B. Lengyel, A. Varga, B. Ságvári, Á. Jakobi, and J. Kertész, "Geographies of an Online Social Network," PLOS ONE, vol. 10, no. 9, p. e0137248, 2015.
[40] S. Tsugawa and K. Kimura, "Identifying influencers from sampled social networks," Physica A: Statistical Mechanics and its Applications, vol. 507, pp. 294-303, 2018/10/01/ 2018.
[41] S. Tsugawa and H. Ohsaki, "Benefits of Bias in Crawl-Based Network Sampling for Identifying Key Node Set," IEEE Access, vol. 8, pp. 75370-75380, 2020.
[42] L. J. C. Lovász, Paul erdos is eighty, "Random walks on graphs: A survey," vol. 2, no. 1, pp. 1-46, 1993.
[43] T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to algorithms. MIT press, 2009.
[44] A. S. Maiya and T. Y. Berger-Wolf, "Benefits of bias: towards better characterization of network sampling," presented at the Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining, San Diego, California, USA, 2011. Available: https://doi.org/10.1145/2020408.2020431
[45] W. Chen, Y. Wang, and S. Yang, "Efficient influence maximization in social networks," presented at the Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, Paris, France, 2009. Available: https://doi.org/10.1145/1557019.1557047
[46] J. Tang, C. Zhang, K. Cai, L. Zhang, and Z. Su, "Sampling Representative Users from Large Social Networks," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 29, no. 1, 02/09 2015.
[47] J. Tang, X. Tang, and J. Yuan, "Towards Profit Maximization for Online Social Network Providers," in IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, 2018, pp. 1178-1186.
[48] J. H. Holland, Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. MIT press, 1992.
[49] D. Whitley, "A genetic algorithm tutorial," Statistics and Computing, vol. 4, no. 2, pp. 65-85, 1994/06/01 1994.
[50] A. Thengade and R. Dondal, "Genetic algorithm–survey paper," in MPGI National Multi Conference, 2012, pp. 7-8: Citeseer.

無法下載圖示 全文公開日期 2024/07/27 (校內網路)
全文公開日期 2026/07/27 (校外網路)
全文公開日期 2026/07/27 (國家圖書館:臺灣博碩士論文系統)
QR CODE