
Graduate Student: 林俊達 (Chun-Ta Lin)
Thesis Title: 基於雙向遞迴神經網路之辭彙預測能力研究
A Study of Bi-directional Recurrent Neural Network on Word Prediction Capability
Advisor: 林伯慎 (Bor-Shen Lin)
Committee Members: 羅乃維 (Nai-Wei Lo), 古鴻炎 (Hung-Yan Gu)
Degree: Master
Department: Department of Information Management, School of Management
Year of Publication: 2018
Graduation Academic Year: 106 (ROC calendar)
Language: Chinese
Number of Pages: 50
Chinese Keywords: 遞迴神經網路, 長短期記憶單元, 序列到序列模型, 遺失詞預測, N連語言模型
English Keywords: recurrent neural network, long short-term memory, sequence-to-sequence model, missing word prediction, N-gram language model

The main goal of this research is to apply a recurrent neural network (RNN) sentence prediction model to the prediction, or recommendation, of missing words in a sentence. Word prediction has traditionally relied on N-gram language models, whose weakness is that they cannot exploit longer-range word dependencies. In forward word prediction experiments with the RNN sentence prediction model, we found that every position except the beginning of the sentence could be reconstructed accurately. We then trained and tested a model on reversed sentences and found it to be symmetric to the forward model: it predicts poorly near the end of the sentence. We therefore propose a method that combines the forward and backward models for predicting words within a sentence, so that each compensates for the other's weakness.
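As a rough illustration of the combination idea (not taken from the thesis), the sketch below blends, for one gap position, the word distribution produced by the forward model (conditioned on the left context) with the one produced by the backward model (conditioned on the right context). The linear interpolation and the weight `alpha` are assumptions made only for this example; the thesis states just that the forward and backward results are combined.

```python
import numpy as np

def combine_predictions(p_forward, p_backward, alpha=0.5):
    """Blend the word distributions from the forward RNN (left context)
    and the backward RNN (right context) for one sentence position."""
    p = alpha * p_forward + (1.0 - alpha) * p_backward
    return p / p.sum()  # renormalize so the blend is a proper distribution

# Toy 5-word vocabulary; in practice both distributions come from the RNNs.
p_fwd = np.array([0.10, 0.60, 0.10, 0.10, 0.10])  # left context is informative
p_bwd = np.array([0.05, 0.30, 0.50, 0.10, 0.05])  # right context is informative
best_word_id = int(np.argmax(combine_predictions(p_fwd, p_bwd)))
```

With these toy numbers the blended distribution favors the word supported by both contexts, which is the intended effect of letting the two models compensate for each other's weak positions.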
We further examined the model's ability to recommend missing words. Such recommendations can suggest words while a user is writing an article, or propose revisions for a sentence that does not read fluently. In tests on training sentences, we found that the model can retrieve related words of a reasonable part of speech that keep the sentence fluent when placed in the gap. Finally, we compared this model with bigram and trigram language models in missing word prediction experiments; the results show that the sentence prediction model outperforms both.
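For comparison, the bigram baseline mentioned above can be sketched as follows: a candidate word for a gap is scored by the product of the bigram probabilities linking it to its left and right neighbors. The add-one smoothing and the toy counts below are assumptions made for this hypothetical example; in the thesis the N-gram statistics are estimated from the Apple Daily corpus.

```python
from collections import Counter

def bigram_gap_score(candidate, left, right, bigram_counts, unigram_counts):
    """Score a candidate for the gap "left ___ right" using
    P(candidate | left) * P(right | candidate) with add-one smoothing."""
    vocab_size = len(unigram_counts)
    p_in = (bigram_counts[(left, candidate)] + 1) / (unigram_counts[left] + vocab_size)
    p_out = (bigram_counts[(candidate, right)] + 1) / (unigram_counts[candidate] + vocab_size)
    return p_in * p_out

# Toy counts for the gap "我 ___ 咖啡" (I ___ coffee).
unigrams = Counter({"我": 50, "喜歡": 20, "喝": 15, "咖啡": 10})
bigrams = Counter({("我", "喜歡"): 8, ("我", "喝"): 4, ("喜歡", "喝"): 5, ("喝", "咖啡"): 6})
candidates = ["喜歡", "喝", "咖啡"]
best = max(candidates, key=lambda w: bigram_gap_score(w, "我", "咖啡", bigrams, unigrams))
```

Because the bigram score only looks one word to each side, it cannot exploit dependencies farther away in the sentence, which is the limitation the RNN model is meant to address.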


The aim of this research is to predict missing words in a sentence using a sentence prediction model based on a recurrent neural network. A common approach to this problem is the N-gram language model, but its weakness is that it cannot exploit long-range word relationships within a sentence. In the self-prediction experiment with our forward model, we find that every position except the beginning of the sentence can be reconstructed very well. The same experiment on the backward model shows symmetric behavior, with poor prediction near the end of the sentence. Hence, we propose to combine the results of the forward and backward models so that the strengths of one balance the weaknesses of the other.
We also examine the model's ability to predict missing words in applications such as suggesting words during article writing or refining a sentence that does not read fluently. During testing, we find that the model can predict related words of a reasonable part of speech that fit appropriately into the sentence. Finally, we compare the model with bigram and trigram language models in the missing word prediction experiment; the results show that our word prediction model outperforms both.

Chapter 1 Introduction
  1.1 Background and Research Motivation
  1.2 Objectives and Summary of Results
  1.3 Organization of the Thesis
Chapter 2 Literature and Technical Background
  2.1 Literature on Recurrent Neural Network Language Models
    2.1.1 Recurrent Neural Networks
    2.1.2 Long Short-Term Memory Units
    2.1.3 Recurrent Neural Network Language Models
  2.2 Sequence-to-Sequence Models
  2.3 Predicting Missing Words in a Sentence
  2.4 Accuracy Evaluation Metrics
  2.5 Chapter Summary
Chapter 3 Sentence Prediction Based on Recurrent Neural Networks
  3.1 Document Preprocessing
  3.2 Sentence Prediction Model
  3.3 Sentence Prediction Model Experiments
    3.3.1 The Apple Daily News Corpus
    3.3.2 Reconstruction Capability Experiments
    3.3.3 Effect of Word Position on Accuracy
    3.3.4 Analysis of Differences Between the Forward and Backward Models
    3.3.5 Reconstruction Capability with Different Numbers of Hidden Nodes
    3.3.6 Applying the Sentence Prediction Model to Missing Word Prediction
  3.4 Comparison with Bigram and Trigram Language Models
  3.5 Applying the Sentence Prediction Model to Specific Domains
  3.6 Chapter Summary
Chapter 4 Conclusions and Future Directions
References
