
Author: Hsu-Jeng Tsai (蔡旭真)
Thesis Title: Aspect-Based Sentiment Analysis Using Data Augmentation Approach (數據增強方法應用於多重面向情感分析)
Advisor: Yung-Ho Leu (呂永和)
Committee Members: Wei-Ning Yang (楊維寧), Yun-Shiow Chen (陳雲岫)
Degree: Master
Department: Department of Information Management, School of Management
Publication Year: 2022
Graduation Academic Year: 110 (2021–2022)
Language: English
Pages: 42
Keywords: Data Augmentation, Aspect-Based Sentiment Analysis, Pre-trained Language Model
    Abstract:
    Mining user opinions has become a growing research domain because of its wide range of real-world applications. This field is known as opinion mining or sentiment analysis, and aspect-based sentiment analysis (ABSA) is a fine-grained form of it. ABSA aims to determine the sentiment polarity (positive, negative, or neutral) of a given aspect term, a task commonly used to analyze reviews on online platforms. Recently, many researchers have applied different methods to ABSA, such as neural networks, graph neural networks, attention mechanisms, and Transformers. However, these studies overlook the fact that ABSA datasets are small compared with the datasets of other NLP tasks. To tackle this problem, we apply data augmentation techniques that effectively increase the amount and diversity of the data for the ABSA task.
    Experimental results on two benchmark datasets demonstrate that our approach improves the performance of the baseline model. We also analyze how data augmentation affects regularization and how the training data size affects accuracy.
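
    For readers unfamiliar with the augmentation techniques referred to above, the following Python sketch illustrates two of the approaches covered in the thesis, EDA-style random swap and deletion (Wei & Zou, 2019) and AEDA-style punctuation insertion (Karimi et al., 2021), on a single SemEval-2014-style ABSA example. It is a minimal illustration, not the implementation used in the thesis; the function names, parameter values, and the rule of leaving the aspect term untouched are assumptions of this sketch.

    import random

    # Punctuation marks inserted by AEDA (Karimi et al., 2021).
    PUNCTUATIONS = [".", ";", "?", ":", "!", ","]

    def aeda_insert(tokens, ratio=0.3):
        """AEDA-style augmentation: randomly insert punctuation marks between
        tokens; the original words (and hence the aspect term) are untouched."""
        augmented = list(tokens)
        for _ in range(max(1, int(ratio * len(tokens)))):
            pos = random.randint(0, len(augmented))
            augmented.insert(pos, random.choice(PUNCTUATIONS))
        return augmented

    def eda_swap_delete(tokens, aspect_term, n_swaps=1, p_delete=0.1):
        """Two of the four EDA operations (Wei & Zou, 2019): random swap and
        random deletion. Tokens of the aspect term are preserved so that the
        aspect-polarity label stays valid (an assumption of this sketch)."""
        aspect_tokens = set(aspect_term.lower().split())
        augmented = list(tokens)
        # Random swap: exchange two randomly chosen non-aspect tokens.
        for _ in range(n_swaps):
            candidates = [i for i, tok in enumerate(augmented)
                          if tok.lower() not in aspect_tokens]
            if len(candidates) >= 2:
                i, j = random.sample(candidates, 2)
                augmented[i], augmented[j] = augmented[j], augmented[i]
        # Random deletion: drop each non-aspect token with probability p_delete.
        augmented = [tok for tok in augmented
                     if tok.lower() in aspect_tokens or random.random() > p_delete]
        return augmented

    # Usage on a SemEval-2014-style example: (sentence, aspect term, polarity).
    sentence = "The battery life is great but the screen is too dim".split()
    aspect, polarity = "battery life", "positive"
    print(" ".join(eda_swap_delete(sentence, aspect)), "->", aspect, polarity)
    print(" ".join(aeda_insert(sentence)), "->", aspect, polarity)

    Mixup, the third augmentation family listed in the table of contents, interpolates embedding vectors and labels rather than editing tokens, so it is omitted from this token-level sketch.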

    Table of Contents:
    ABSTRACT
    ACKNOWLEDGEMENT
    TABLE OF CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    Chapter 1 Introduction
    1.1 RESEARCH BACKGROUND
    1.2 RESEARCH PURPOSE
    1.3 RESEARCH METHOD
    1.4 RESEARCH OVERVIEW
    Chapter 2 Related Work
    2.1 DATA AUGMENTATION TECHNIQUES
    2.1.1 Lexical Substitution
    2.1.2 Back Translation
    2.1.3 Noise-based Injection
    2.1.4 Mixup
    2.2 ASPECT-BASED SENTIMENT ANALYSIS MODELS
    2.2.1 Long Short-Term Memory
    2.2.2 Attention Mechanism
    2.2.3 Transformer
    2.2.4 Graph Convolutional Networks
    Chapter 3 Techniques & Methods
    3.1 TASK DEFINITION
    3.2 DATA AUGMENTATION (DA)
    3.2.1 Easy Data Augmentation (EDA) in ABSA
    3.2.2 An Easier Data Augmentation for ABSA
    3.2.3 Mixup for ABSA
    3.3 BERT
    3.3.1 BERT Embeddings
    3.3.2 BERT Encoder
    3.4 CLASSIFIER
    Chapter 4 Experiments
    4.1 DATASETS
    4.2 EXPERIMENT SETTINGS AND EVALUATION METRICS
    4.3 BASELINE
    4.4 RESULTS
    4.5 EFFECT OF REGULARIZATION
    4.6 SIZE OF TRAINING DATA
    4.7 CASE STUDY
    Chapter 5 Conclusion and Future Work
    APPENDIX
    A. THE SUMMARY FOR THE EFFECT OF REGULARIZATION
    B. THE SUMMARY FOR SIZE OF TRAINING DATA
    Reference


    Full-Text Release Date: 2024/07/25 (campus network)
    Full-Text Release Date: 2024/07/25 (off-campus network)
    Full-Text Release Date: 2024/07/25 (National Central Library: Taiwan NDLTD system)