
Author: Hsin-Tung Shih (施欣彤)
Thesis title: Filter-Based Sentiment Keyword Search for Investigating the Relationship Between Posts and Comments Under Multi-Objective Optimization Framework
Original title (Chinese): 以多元目標最佳化建構情緒關鍵字搜尋法-以過濾貼文及回應之關連性為例
Advisor: Chao-Long Yang (楊朝龍)
Oral defense committee: Kwei-Long Huang (黃奎隆), Shi-Woei Lin (林希偉)
Degree: Master
Department: College of Management - Department of Industrial Management
Year of publication: 2022
Graduation academic year: 110 (ROC calendar)
Language: English
Number of pages: 76
Keywords: multi-objective optimization, NSGA-III, social network analysis, Fuzzy C-Means, TF-IDF, sentiment lexicons
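  • Social media platforms such as Facebook, Twitter, and Instagram are increasingly popular because they let users create and share information online and comment on anything related to their daily lives. Such information is especially useful in many fields, for example online advertising. Social network analysis of posts and comments has therefore become an important research area. Previous studies focused on analyzing the positive or negative polarity of posts and comments rather than the relationship between them. This study therefore proposes a multi-objective optimization framework for investigating the relationship between posts and comments. Within the framework, sentiment lexicons are combined with TF-IDF to find keywords that can represent posts and comments, and the Fuzzy C-Means (FCM) method then merges similar posts and comments into several clusters. The multi-objective optimization algorithm NSGA-III is then applied to optimize the three objectives of this study: (1) the average variance of post membership degrees, (2) the average variance of comment membership degrees, and (3) the average root-mean-square error of the predicted comment results. The study uses the proposed framework to match posts with their corresponding comments: by identifying the specific responses that specific keywords elicit and the types of responses received, we can predict the kind of responses a post will get. The results show that the proposed framework helps uncover the relationship between posts and comments. A further comparison with the single-objective optimization algorithm GA confirmed that NSGA-III is needed to obtain results that provide insight for the investigation.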
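    The three objectives named in the abstract can be illustrated with a minimal sketch. The abstract does not give their exact formulas, so the definitions below are assumptions: each row is one document's fuzzy membership vector over the clusters, the "average variance" is the per-row variance averaged over documents, and the prediction error is the per-row RMSE between predicted and actual comment memberships, averaged over documents. All function names are hypothetical, and the sketch does not say whether each objective is minimized or maximized.

```python
from math import sqrt
from statistics import pvariance


def mean_membership_variance(memberships):
    """Average, over documents, of the variance of each membership row (assumed definition)."""
    return sum(pvariance(row) for row in memberships) / len(memberships)


def mean_rmse(predicted, actual):
    """Average, over documents, of the per-row root-mean-square error (assumed definition)."""
    per_row = [
        sqrt(sum((p - a) ** 2 for p, a in zip(p_row, a_row)) / len(a_row))
        for p_row, a_row in zip(predicted, actual)
    ]
    return sum(per_row) / len(per_row)


def evaluate(post_u, comment_u, predicted_comment_u):
    """Return the three objective values as one tuple for a candidate solution."""
    return (
        mean_membership_variance(post_u),
        mean_membership_variance(comment_u),
        mean_rmse(predicted_comment_u, comment_u),
    )


# Toy data: two documents, two clusters (illustrative values only).
post_u = [[1.0, 0.0], [0.5, 0.5]]
comment_u = [[1.0, 0.0], [0.0, 1.0]]
predicted_comment_u = [[0.5, 0.5], [0.0, 1.0]]
objectives = evaluate(post_u, comment_u, predicted_comment_u)
print(objectives)
```

    A multi-objective optimizer would receive the whole tuple for each candidate, whereas a single-objective method would first have to collapse it into one number.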


    Online social media platforms such as Facebook, Twitter, and Instagram are increasingly popular because they allow users to create, share, and comment on anything connected to their daily lives. Such information is particularly helpful in many fields, such as online advertising. Social network analysis of posts and comments has therefore become an important research area. Previous studies focused on analyzing the positive or negative polarity of posts and comments, but not the relationships between them. In this research, a multi-objective optimization framework is proposed to investigate the relationship between posts and comments. A combination of sentiment lexicons and TF-IDF was used to find keywords that could represent posts and comments. The Fuzzy C-Means (FCM) clustering method was used to cluster similar posts and comments together. The aim is to find out which comments a post will receive in response: by identifying which keywords elicit responses from other users and the type of reactions they receive, we can predict what kind of responses each post will have. This is useful for marketing or sales teams analyzing which keywords attract the attention of consumers or customers. NSGA-III was used to optimize the three objectives of this research: (1) the average variance of post membership, (2) the average variance of comment membership, and (3) the root-mean-square error of prediction. The results show that the proposed framework can help find the relationship between posts and comments. A further comparison with a single-objective optimization algorithm, GA, shows the advantages of using NSGA-III to solve this research problem.
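    The difference between a multi-objective method and a single-objective GA rests on Pareto dominance, which can be sketched in a few lines. This is only the non-dominated filtering idea, not NSGA-III itself, which additionally uses reference-point-based selection. The sketch assumes every objective is minimized, and the function names and candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective tuples for three candidate solutions.
candidates = [
    (0.1, 0.2, 0.5),
    (0.2, 0.1, 0.4),
    (0.3, 0.3, 0.6),  # worse than the first candidate in every objective
]
front = pareto_front(candidates)
print(front)
```

    The first two candidates trade off against each other, so both stay on the front; a single-objective search would return only one of them, depending on how the objectives were combined.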

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
    Chapter 2. Literature Review
        2.1. Sentiment Analysis
        2.2. Keyword Extraction Methods
        2.3. Multi-objective Optimization Algorithms
    Chapter 3. Methodology
        3.1. Keyword filter
            3.1.1. Affective Norms for English Words
            3.1.2. SentiWordNet 3.0
            3.1.3. Valence Aware Dictionary and sEntiment Reasoner
        3.2. List of sorted important keywords
        3.3. Matrix of Post Keywords and Matrix of Comment Keywords
        3.4. Fuzzy C-Means Clustering
        3.5. Random Forest Prediction
        3.6. Multi-optimization Framework
            3.6.1. Crossover and Mutation
            3.6.2. Evaluation
            3.6.3. Pareto front
            3.6.4. Non-dominated sorting algorithm (NSGA-III)
    Chapter 4. Experiments and Results
        4.1. Data Description
        4.2. Data Preprocessing
        4.3. Selection of decision variable
        4.4. Optimal solutions
        4.5. Cluster analysis
        4.6. Compare with Genetic Algorithm (GA)
    Chapter 5. Conclusion
        5.1. Conclusion
        5.2. Future work and discussion
    References
    Appendix


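    Full text not available for download
    Full-text release date: 2025/09/12 (campus network)
    Full-text release date: 2025/09/12 (off-campus network)
    Full-text release date: 2025/09/12 (National Central Library: Taiwan theses and dissertations system)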