
Graduate student: Yu-Ling Chuang (莊鈺翎)
Thesis title: Improving the Performance of Few-Shot Text Classification using Meta-Learning (運用元學習提升小樣本文本分類任務的效能)
Advisor: Yung-Ho Leu (呂永和)
Committee members: Wei-Ning Yang (楊維寧), Yun-Shiow Chen (陳雲岫)
Degree: Master
Department: Department of Information Management, School of Management
Year of publication: 2023
Graduation academic year: 111 (ROC calendar)
Language: English
Pages: 39
Keywords: Few-shot learning, Meta-learning, Text classification, Language model
Abstract:
In recent years, research on few-shot learning has been increasing steadily, aiming to address the problem of limited training data. Meta-learning is one approach to few-shot learning: the model learns a set of optimal initial parameters across a variety of training tasks, enabling it to adapt quickly to tasks it has never seen. In this study, we apply Model-Agnostic Meta-Learning (MAML) to few-shot text classification tasks and combine it with a language model for training.
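
To make the episodic setup concrete, the following minimal Python sketch shows one plausible way to sample an N-way K-shot task from a labeled text corpus, with a support set for inner-loop adaptation and a query set for the meta-update. The function name sample_episode and the 5-way 5-shot defaults are illustrative assumptions, not the configuration used in the thesis.

import random
from collections import defaultdict

def sample_episode(dataset, n_way=5, k_shot=5, q_query=15, rng=random):
    # Sample one N-way K-shot episode from a list of (text, label) pairs.
    # Returns (support, query): lists of (text, episode_label) pairs with
    # episode labels re-indexed to 0..n_way-1.
    by_label = defaultdict(list)
    for text, label in dataset:
        by_label[label].append(text)

    # Keep only classes with enough examples for both support and query sets.
    eligible = [c for c, xs in by_label.items() if len(xs) >= k_shot + q_query]
    classes = rng.sample(eligible, n_way)

    support, query = [], []
    for episode_label, cls in enumerate(classes):
        texts = rng.sample(by_label[cls], k_shot + q_query)
        support += [(t, episode_label) for t in texts[:k_shot]]
        query += [(t, episode_label) for t in texts[k_shot:]]
    return support, query

Each meta-training iteration would draw a batch of such episodes; evaluation repeats the same procedure on classes held out from training.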
However, meta-learning is also prone to overfitting the training tasks, which leads to poor generalization to the testing tasks. Moreover, the computational complexity of MAML makes training costly in both time and resources. To address the overfitting and training-efficiency issues, we employed two variants of MAML, First-Order MAML (FOMAML) and Reptile, which stabilize the training process and reduce computational cost. In addition, to improve the generalization ability of meta-learning, we applied regularization to constrain the parameters adapted for each training task. The experimental results demonstrate improved accuracy on testing tasks with limited data.
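
As a rough illustration of the first-order variants mentioned above, here is a minimal NumPy sketch of a Reptile meta-update for a linear softmax classifier over pre-extracted text features (for example, sentence embeddings from a language model). All names and hyperparameters are illustrative, and the l2 term is one plausible reading of "constraining the parameters for each task": a proximal penalty that pulls the task-adapted weights toward the meta-initialization. The thesis's actual regularizers (L2 and dropout, per the table of contents) may be applied differently.

import numpy as np

def sgd_adapt(theta, X, y, steps=5, lr=0.01, l2=0.0):
    # Inner loop: adapt a linear softmax classifier to one task.
    # theta: (d, c) weight matrix; X: (n, d) features; y: (n,) integer labels.
    # l2 > 0 adds a proximal penalty that keeps the adapted weights close
    # to the meta-initialization (an illustrative regularization choice).
    phi = theta.copy()
    for _ in range(steps):
        logits = X @ phi
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        p[np.arange(len(y)), y] -= 1.0        # softmax cross-entropy gradient
        grad = X.T @ p / len(y) + l2 * (phi - theta)
        phi -= lr * grad
    return phi

def reptile_step(theta, tasks, inner_steps=5, inner_lr=0.01, meta_lr=0.1, l2=0.0):
    # Outer loop: one Reptile meta-update over a batch of (X, y) tasks.
    # Reptile simply moves theta toward the task-adapted parameters, so no
    # second-order gradients are needed.
    deltas = []
    for X, y in tasks:
        phi = sgd_adapt(theta, X, y, inner_steps, inner_lr, l2)
        deltas.append(phi - theta)
    return theta + meta_lr * np.mean(deltas, axis=0)

Repeatedly calling reptile_step on batches of sampled training tasks yields an initialization that a few SGD steps can adapt to a new few-shot task. FOMAML differs only in the outer update: it applies the query-set gradient evaluated at the adapted parameters phi directly as the meta-gradient, again avoiding second-order terms.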

Table of Contents:
Abstract (Chinese) I
ABSTRACT II
ACKNOWLEDGEMENT III
TABLE OF CONTENTS IV
LIST OF FIGURES VI
LIST OF TABLES VII
Chapter 1 Introduction 1
  1.1 Research Background 1
  1.2 Research Purpose 2
  1.3 Research Method 3
  1.4 Research Overview 4
Chapter 2 Related Work 5
  2.1 Few-shot Learning 5
  2.2 Meta-Learning 6
    2.2.1 Model-Agnostic Meta-Learning (MAML) 7
    2.2.2 First-Order Model-Agnostic Meta-Learning (FOMAML) 9
    2.2.3 Reptile 10
  2.3 Regularization 11
    2.3.1 L2 Regularization 11
    2.3.2 Dropout 11
Chapter 3 Techniques & Methods 13
  3.1 Task Definition 13
  3.2 Meta-learning Framework 13
  3.3 Regularization 15
Chapter 4 Experiments and Results 17
  4.1 Datasets 17
  4.2 Model 18
  4.3 Experiments Setup 18
  4.4 Performance Evaluation 19
  4.5 Regularization 23
  4.6 Hyperparameter Setting 24
Chapter 5 Conclusion and Future Work 26
  5.1 Conclusion 26
  5.2 Future Work 26
References 28


Full-text release date: 2025/07/21 (campus network)
Full-text release date: 2025/07/21 (off-campus network)
Full-text release date: 2025/07/21 (National Central Library: Taiwan thesis system)