Graduate student: 梁博為
Po-Wei Liang
Thesis title: 社群媒體資料的情感分析系統
System of Sentiment Analysis for Social Media Data
Advisor: 戴碧如
Bi-Ru Dai
Committee members: 鲍興國
Hsing-Kuo Pao
戴志華
Chih-Hua Tai
蔡曉萍
Hsiao-Ping Tsai
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2013
Academic year of graduation: 101
Language: English
Number of pages: 37
Chinese keywords: 微型部落格 (microblogging), 語意分析 (sentiment analysis), 意見探勘 (opinion mining)
English keywords: Microblogging, Sentiment analysis, Opinion Mining
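  • Abstract (translated from Chinese): Microblogging services (Twitter or Facebook) have in recent years become very popular communication tools among Internet users. On these microblogs, many users share their thoughts and opinions on all kinds of topics in daily life. Messages are created and managed by a single person through a computer or mobile phone, and most of them are textual. These opinions can serve as a reference for other users and help them make decisions, so helping users collect and analyze this data is worthwhile. The task is very challenging, however, because messages on microblogs are usually short and highly colloquial, and traditional opinion mining algorithms do not perform well on this kind of data. In this thesis, we therefore propose a system architecture that analyzes sentiment automatically. We use manually labeled data from Twitter, a very popular microblogging service. In this system, machine learning algorithms automatically extract the messages that contain opinions, filter out those that do not, and then determine the sentiment orientation (positive or negative) of the opinionated messages. The experimental results show that our system performs well on microblog sentiment analysis.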


    Microblogging (Twitter or Facebook) has become a very popular communication tool among Internet users in recent years. Information is generated and managed through either computers or mobile devices by one person and is consumed by many others, with most of this user-generated content being textual. Because people post a large volume of real-time messages about their opinions on a variety of everyday topics, collecting and analyzing these data is a worthwhile research endeavor; the results may, for example, help users or managers make informed decisions. However, this problem is challenging because a microblog post is usually very short and colloquial, and traditional opinion mining algorithms do not work well on this type of text. Therefore, in this paper, we propose a new system architecture that can automatically analyze the sentiments of these messages. We combine this system with manually annotated data from Twitter, one of the most popular microblogging platforms, for the task of sentiment analysis. In this system, machines learn to automatically extract the set of messages that contain opinions, filter out non-opinion messages, and determine the sentiment direction (i.e., positive or negative) of the opinion messages. Experimental results verify the effectiveness of our system for sentiment analysis in real microblogging applications.
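
    The two-stage flow described in the abstract (first separate opinion messages from non-opinion ones, then assign a polarity to the opinion messages) might be sketched as below. This is only a minimal illustration under stated assumptions, not the thesis's implementation: the tiny Naive Bayes classifier is swapped in purely for brevity (the abstract does not say which classifier is used), and the `NaiveBayes` and `TwoStageSentiment` classes, the whitespace tokenizer, the `"none"` label, and the toy training messages are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real tweets need more care)."""
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class TwoStageSentiment:
    """Stage 1 filters non-opinion messages; stage 2 assigns a polarity."""

    def fit(self, texts, labels):
        # Labels are "positive", "negative", or "none" (hypothetical label for no opinion).
        has_opinion = ["none" if lab == "none" else "opinion" for lab in labels]
        self.opinion_filter = NaiveBayes().fit(texts, has_opinion)
        # Stage 2 is trained only on the messages that carry an opinion.
        opinionated = [(t, lab) for t, lab in zip(texts, labels) if lab != "none"]
        self.polarity = NaiveBayes().fit(*zip(*opinionated))
        return self

    def predict(self, text):
        if self.opinion_filter.predict(text) == "none":
            return "none"
        return self.polarity.predict(text)


# Toy, hand-written training data standing in for manually annotated tweets.
train = [
    ("i love this phone", "positive"),
    ("great camera and great screen", "positive"),
    ("i hate this battery", "negative"),
    ("terrible service and awful support", "negative"),
    ("the store opens at nine", "none"),
    ("new model ships on monday", "none"),
]
model = TwoStageSentiment().fit(*zip(*train))
```

    On this toy data, `model.predict("i hate this awful service")` returns `"negative"`, while `model.predict("the new model ships monday")` is filtered out in the first stage and returns `"none"`. Separating the two stages keeps the polarity classifier from having to model non-opinion text at all.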

    ACKNOWLEDGEMENTS
    ABSTRACT
    ABSTRACT (CHINESE)
    TABLE OF CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    1. INTRODUCTION
        1.1 BACKGROUND
        1.2 MOTIVATION AND CONTRIBUTION
        1.3 THESIS ORGANIZATION
    2. RELATED WORKS
        2.1 OPINION RETRIEVAL
        2.2 OPINION MINING
    3. SYSTEM ARCHITECTURE
        3.1 PREPROCESSING
        3.2 SUPERVISED LEARNING ALGORITHM
        3.3 FEATURE SELECTION
        3.4 SAMPLING
    4. EXPERIMENT STUDY
        4.1 DATA SET AND EVALUATION CRITERIA
        4.2 EXPERIMENTAL RESULTS
    5. CONCLUSION AND FUTURE WORKS
    REFERENCE

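    Full-text release date: 2018/07/29 (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)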