
Graduate Student: Zheng-Wei Hong (洪政緯)
Thesis Title: PV-DM Model Based on Transfer Learning (基於遷移學習之PV-DM模型)
Advisor: Wei-Ning Yang (楊維寧)
Committee Members: 楊維寧 (Wei-Ning Yang), 陳雲岫, 呂永和
Degree: Master
Department: Department of Information Management, School of Management
Year of Publication: 2019
Academic Year of Graduation: 107
Language: Chinese
Pages: 19
Chinese Keywords: natural language processing, transfer learning
Foreign Keywords: PV-DM, Paragraph Vector, doc2vec, sentence embedding
  • With the development of machine learning in recent years, natural language processing has become a research focus. For computers to understand human language, text must first be converted into data that they can compute on.
    PV-DM is one way of converting documents into feature vectors. When PV-DM trains document feature vectors, it must also train word vectors, which is redundant for those who only want document feature vectors; moreover, training a PV-DM model often requires a large dataset to obtain good results.
    The purpose of this study is therefore to propose a method that converts documents into feature vectors by transferring a Fasttext pre-trained word-vector model. Because this method resembles PV-DM, we examine whether it improves feature quality compared with the PV-DM model. The data are first split into training and test sets and converted into feature vectors by the proposed method; logistic regression then performs binary classification, and the Accuracy and AUC on the test set are used to evaluate the feature vectors.
    Experimental results show that, on both the IMDB and YELP datasets, the sentence feature vectors produced by the proposed method outperform those of the original PV-DM model.
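The evaluation pipeline described above (document feature vectors, then binary classification with logistic regression, scored by Accuracy and AUC) can be sketched as follows. This is a minimal illustration only: the random word-vector lookup table and the toy reviews stand in for the Fasttext pre-trained vectors and the IMDB/YELP data used in the thesis.

```python
# Sketch of the evaluation pipeline: pool word vectors into a document
# vector, fit logistic regression, score Accuracy and AUC on held-out data.
# Assumption: a small random lookup table replaces real pretrained vectors.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score

rng = np.random.default_rng(0)
DIM = 16
vocab = {w: rng.normal(size=DIM) for w in
         "good great fine bad awful poor movie film plot acting".split()}

def doc_vector(text):
    """Mean-pool the vectors of tokens found in the lookup table."""
    vecs = [vocab[w] for w in text.lower().split() if w in vocab]
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)

train_texts = ["good great movie", "great acting fine film",
               "bad awful plot", "poor acting awful movie"]
train_y = [1, 1, 0, 0]
test_texts = ["good fine acting", "awful poor film"]
test_y = [1, 0]

X_train = np.stack([doc_vector(t) for t in train_texts])
X_test = np.stack([doc_vector(t) for t in test_texts])

clf = LogisticRegression().fit(X_train, train_y)
acc = accuracy_score(test_y, clf.predict(X_test))
auc = roc_auc_score(test_y, clf.predict_proba(X_test)[:, 1])
```

Mean pooling is only one plausible way to combine the transferred word vectors into a document vector; the thesis's actual combination scheme follows the PV-DM formulation.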


    With the rapid growth of Machine Learning, Natural Language Processing (NLP) has become an important research topic; it transforms texts into computable data so that computers can understand human language.
    One important method in NLP is the Distributed Memory Model of Paragraph Vectors (PV-DM), which transforms paragraphs or documents into feature representations. However, PV-DM must train word vectors alongside document vectors, which increases the amount of training data required and is unnecessary for those who only need document vectors.
    Therefore, this study proposes a method that transforms paragraphs into feature representations by transferring Fasttext pre-trained word vectors. Because the proposed method is similar to PV-DM, the two are compared on Accuracy and AUC: each method transforms the documents into feature representations, which are then classified with binary Logistic Regression.
    The experimental results show that the proposed method outperforms PV-DM across different training data sizes.
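The step both approaches share is learning a paragraph vector that, combined with word vectors, predicts words in the document; transferring pre-trained word vectors amounts to keeping those word vectors frozen while only the paragraph vector is fitted. A minimal numpy sketch of that inference step follows; the toy vocabulary, dimensions, and hyperparameters are illustrative assumptions, not the thesis's settings.

```python
# PV-DM-style inference sketch: word vectors are frozen (as when
# transferring pretrained embeddings) and only the paragraph vector is
# learned by gradient descent to predict each word from the paragraph
# vector averaged with its context-word vectors.
import numpy as np

rng = np.random.default_rng(42)
words = "the movie was good bad and plot acting".split()
idx = {w: i for i, w in enumerate(words)}
V, D = len(words), 8

W = rng.normal(scale=0.1, size=(V, D))   # frozen (pretrained) input vectors
U = rng.normal(scale=0.1, size=(V, D))   # frozen output (softmax) weights

def infer_doc_vector(tokens, window=2, epochs=50, lr=0.1):
    """Learn a paragraph vector for `tokens` while W and U stay fixed."""
    d = rng.normal(scale=0.1, size=D)
    ids = [idx[t] for t in tokens if t in idx]
    for _ in range(epochs):
        for pos, target in enumerate(ids):
            ctx = ids[max(0, pos - window):pos] + ids[pos + 1:pos + 1 + window]
            if not ctx:
                continue
            # hidden state: average of paragraph vector and context vectors
            h = (d + W[ctx].sum(axis=0)) / (1 + len(ctx))
            scores = U @ h
            p = np.exp(scores - scores.max())
            p /= p.sum()
            p[target] -= 1.0                  # softmax cross-entropy gradient
            d -= lr * (U.T @ p) / (1 + len(ctx))  # update ONLY the doc vector
    return d

vec = infer_doc_vector("the movie was good".split())
```

Training PV-DM from scratch would additionally update `W` and `U` on every step, which is exactly the extra work (and extra data requirement) the transfer approach avoids.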

    Abstract (Chinese)
    ABSTRACT
    Acknowledgements
    Table of Contents
    List of Figures
    Chapter 1  Introduction
    Chapter 2  Literature Review and Background
      2.1 Distributed Memory Model of Paragraph Vectors (PV-DM)
      2.2 Fasttext
      2.3 Transfer Learning
    Chapter 3  Methodology
    Chapter 4  Experimental Design
      4.1 Dataset Overview
      4.2 Experimental Objectives
      4.3 Experimental Procedure
    Chapter 5  Experimental Results and Analysis
      5.1 IMDB
      5.2 YELP
    Chapter 6  Conclusions and Future Work
    References

    [1] Le, Q. V. & Mikolov, T. (2014). Distributed Representations of Sentences and Documents. Proceedings of the International Conference on Machine Learning (ICML), 1188--1196.

    [2] Pan, S. & Yang, Q. (2010). A Survey on Transfer Learning. IEEE Transactions on Knowledge and Data Engineering, 22, 1345--1359.

    [3] Bojanowski, P., Grave, E., Joulin, A. & Mikolov, T. (2017). Enriching Word Vectors with Subword Information. Transactions of the Association for Computational Linguistics, 5, 135--146.

    [4] Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S. & Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. Advances in Neural Information Processing Systems 26, 3111--3119.

    [5] Mikolov, T., Grave, E., Bojanowski, P., Puhrsch, C. & Joulin, A. (2018). Advances in Pre-Training Distributed Word Representations. Proceedings of the International Conference on Language Resources and Evaluation (LREC 2018).

    [6] Maas, A. L., Daly, R. E., Pham, P. T., Huang, D., Ng, A. Y. & Potts, C. (2011). Learning Word Vectors for Sentiment Analysis. Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Volume 1, 142--150.

    [7] Zhang, X., Zhao, J. & LeCun, Y. (2015). Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems, 649--657.
