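A Study on Multi-class Emotion Recognition: Combining Text Mining Techniques and the BERT Model | National Taiwan University of Science and Technology Electronic Theses and Dissertations System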

Graduate student:	呂佳玲 Chia-Ling Lu
Thesis title:	A Study on Multi-class Emotion Recognition: Combining Text Mining Techniques and the BERT Model
Advisor:	呂永和 Yung-Ho Leu
Oral defense committee:	呂永和 Yung-Ho Leu, 楊維寧 Wei-Ning Yang, 陳雲岫 Yun-Shiow Chen
Degree:	Master
Department:	College of Management - Department of Information Management
Year of publication:	2024
Academic year of graduation:	112 (ROC calendar)
Language:	English
Number of pages:	54
Keywords:	BERT, Deep Learning, Emotion Recognition, Emotion Lexicon, Text Mining

In recent years, with the widespread use of the internet and the dominance of social media, a growing volume of information is being disseminated online. Individuals frequently depend on others' recommendations and feedback in their daily decisions, be it selecting products or choosing movies. People's emotional responses reveal their expectations and reactions to products and services, leading businesses to implement emotion recognition technology in their decision-making processes. This enables them to develop more accurate marketing strategies and enhance customer satisfaction. As a result, text emotion recognition has become a pivotal aspect of contemporary natural language processing.
The challenges of text emotion recognition arise from the diversity of language and the complexity of emotions. A single word can convey vastly different emotions or meanings depending on the context, complicating sentiment analysis. Thus, in our study, we integrated semantic features with emotional attributes in the text. The BERT language model aids in understanding the contextual nuances of words, while TF-IDF highlights the significant terms associated with emotions in the text. By merging these two approaches, we can more thoroughly and precisely interpret and analyze the emotional content in text, enhancing the effectiveness and accuracy of emotion recognition. Furthermore, to tackle the problem of data imbalance in infrequent emotion categories, we employed the AEDA technique. This data augmentation strategy boosts the model's ability to recognize rare emotion categories.
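As a rough illustration of the AEDA step mentioned above, the sketch below inserts random punctuation marks into an utterance, following the general description of AEDA by Karimi et al. The function name `aeda`, the `ratio` default of one third, and the sample sentence are illustrative assumptions; the thesis's actual augmentation settings are not given in this record.

```python
import random

# Punctuation marks used for insertion in the AEDA paper.
PUNCTUATION = [".", ";", "?", ":", "!", ","]


def aeda(sentence, ratio=1 / 3, rng=None):
    """Return a copy of `sentence` with random punctuation tokens inserted.

    Between 1 and ratio * (number of words) marks are inserted, each one
    placed in front of a randomly chosen word. The original words and
    their order are left unchanged.
    """
    rng = rng or random.Random()
    words = sentence.split()
    if not words:
        return sentence
    max_inserts = max(1, int(ratio * len(words)))
    n_inserts = rng.randint(1, max_inserts)
    # Word indices that will receive a punctuation mark in front of them.
    positions = set(rng.sample(range(len(words)), n_inserts))
    augmented = []
    for i, word in enumerate(words):
        if i in positions:
            augmented.append(rng.choice(PUNCTUATION))
        augmented.append(word)
    return " ".join(augmented)


# Hypothetical minority-class utterance; a seeded generator keeps runs repeatable.
sample = "oh my god I can not believe you did that"
print(aeda(sample, rng=random.Random(0)))
```

Because only punctuation is added, the label of the utterance is unlikely to change, which is presumably why the technique suits oversampling of rare emotion classes.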
In our research, we used the MELD dataset for implementation. The experimental findings demonstrate that our proposed framework increases accuracy by 5.36% and improves recognition for minority classes by up to 8%. We have not only improved the overall accuracy and stability of the model but also successfully optimized the emotion classification model, providing a solid foundation for future related research.
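The combination of semantic and emotional features described in the abstract might look roughly like the following sketch. It computes TF-IDF weights restricted to a small emotion lexicon and concatenates them with a sentence embedding. Everything specific here is an assumption made for illustration: the toy lexicon, the smoothed IDF formula, the `placeholder_embedding` stand-in (a real pipeline would presumably use a BERT sentence vector), and fusion by simple concatenation. The record does not state how the thesis actually fuses the two feature sets.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; the thesis's actual emotion lexicon is not given here.
EMOTION_LEXICON = ["happy", "great", "angry", "hate", "sad", "cry"]


def tfidf_vectors(docs, vocab):
    """TF-IDF restricted to `vocab`, using the smoothed IDF ln((1+N)/(1+df)) + 1."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each lexicon word.
    df = Counter(t for toks in tokenized for t in set(toks) if t in vocab)
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero on empty input
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vectors


def placeholder_embedding(text, dim=4):
    """Stand-in for a BERT sentence vector (assumption, not a real model).

    Produces deterministic toy values so the example runs without
    downloading any pretrained weights.
    """
    seed = sum(map(ord, text))
    return [((seed * (i + 1)) % 100) / 100.0 for i in range(dim)]


def fuse(semantic_vec, emotion_vec):
    """Concatenate semantic features with TF-IDF emotion features."""
    return list(semantic_vec) + list(emotion_vec)


docs = [
    "I am so happy this is great",
    "I hate this I am angry",
    "please pass the salt",
]
emotion = tfidf_vectors(docs, EMOTION_LEXICON)
fused = [fuse(placeholder_embedding(d), e) for d, e in zip(docs, emotion)]
for doc, vec in zip(docs, fused):
    print(doc, [round(v, 3) for v in vec])
```

The fused vector would then be passed to a classifier. An utterance with no lexicon words, such as the third one, contributes only its semantic part, since its emotion features are all zero.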

Chinese Abstract
Abstract
Acknowledgement
Table of Contents
List of Figures
List of Tables
Chapter 1	Introduction
1.1	Research Background and Motivation
1.2	Research Objective
1.3	Research Contribution
1.4	Research Overview
Chapter 2	Literature Review
2.1	Natural Language Processing
2.2	Transformer
2.3	Bidirectional Encoder Representations from Transformers
2.3.1	Pre-training
2.3.2	Fine-Tuning BERT
2.4	Emotion Recognition
2.4.1	Emotion Lexicon
2.4.2	Emotion Recognition with Machine Learning
2.4.3	Term Frequency - Inverse Document Frequency
2.5	Data Augmentation
2.5.1	Easy Data Augmentation
2.5.2	An Easier Data Augmentation
Chapter 3	Research Methodology
3.1	Dataset
3.2	Data Preprocessing
3.3	Data Augmentation
3.4	Model Building
3.4.1	BERT Model
3.4.2	Textual Emotional Features
3.4.3	Classification
3.4.4	Model Evaluation
Chapter 4	Experiment Result
4.1	Experimental Environment
4.2	Experimental Result
Chapter 5	Conclusion
Reference

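Full text release date: 2027/08/20 (off-campus network)
Full text release date: 2027/08/20 (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)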
