
Graduate Student: 陳俊瑋 (Chun-Wei Chen)
Thesis Title: 運用混合序列模型於中文語句修正之研究 (Study of Applying Hybrid Sequential Model to Chinese Sentence Correction)
Advisors: 呂政修 (Jenq-Shiou Leu), 陳維美 (Wei-Mei Chen)
Committee Members: 林淵翔 (Yuan-Hsiang Lin), 林昌鴻 (Chang-Hong Lin)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Academic Year of Graduation: 108
Language: Chinese
Number of Pages: 49
Keywords (Chinese): Chinese sentence correction, recurrent neural network
Keywords (Foreign): Transformer, BERT
    In recent years, as Chinese has become one of the most widely used languages in the world, the demand for research on automatic Chinese sentence correction has grown steadily. Beyond its use in Chinese language learning, where it can reduce both the cost of learning and the time needed for feedback, such research can also help professional writers lower the chance of typographical errors. Traditional sentence correction methods mostly compare the words in a sentence against a predefined dictionary, which makes it difficult to correct semantic errors. With the spread of deep learning, however, the variety of errors that automatic sentence correction can handle keeps growing: by letting an artificial neural network learn the contextual meaning of a sentence, semantic errors can be corrected as well. Practical deployment still raises open issues, such as correction accuracy and the time required per correction, so the technology may not yet be suitable for large-scale commercial applications. Transformer and BERT, currently the most popular models, achieve very high accuracy but suffer from slow inference. This thesis therefore proposes a hybrid model for Chinese sentence correction that combines BERT with a recurrent neural network, improving inference speed while preserving the correctness of Chinese sentence correction.


    In recent years, as Chinese has become one of the most popular languages in the world, the demand for automatic Chinese sentence correction has gradually increased. This research can be applied to Chinese language learning to reduce the cost of learning and the feedback time, and it can also help writers catch mistyped words. The traditional way to perform Chinese sentence correction is to check each word against a predefined vocabulary, but such methods cannot deal with semantic errors. Now that deep learning has become popular, an artificial neural network can learn the context of a sentence and thus correct semantic errors, yet many issues still need to be discussed, such as correction accuracy and the time required to correct a sentence; deep-learning-based Chinese sentence correction systems may therefore not yet be ready for large-scale commercial applications. Transformer and BERT, popular recent models known for high accuracy but slow inference, motivate our approach: we introduce a hybrid model for Chinese sentence correction that combines BERT with a recurrent neural network to improve inference speed while maintaining correctness.
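
    To make the architecture described in the abstract concrete, the following is a minimal sketch of a BERT-plus-RNN corrector: a pretrained BERT encoder reads the possibly erroneous sentence, and a lightweight GRU decoder with attention over the BERT outputs generates the corrected sentence, here with simple greedy decoding. The sketch assumes PyTorch and the Hugging Face transformers package with the public bert-base-chinese checkpoint; the thesis cites MXNet and GluonNLP [11][12], so the class names, hyperparameters, and decoding details below are illustrative assumptions rather than the author's actual implementation.

import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast


class BertGRUCorrector(nn.Module):
    """Illustrative BERT encoder + single-layer GRU decoder for sentence correction."""

    def __init__(self, bert_name="bert-base-chinese", hidden_size=768):
        super().__init__()
        self.encoder = BertModel.from_pretrained(bert_name)        # contextual encoder
        vocab_size = self.encoder.config.vocab_size
        self.tgt_embed = nn.Embedding(vocab_size, hidden_size)     # decoder input embedding
        self.decoder = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden_size, num_heads=8, batch_first=True)
        self.out = nn.Linear(hidden_size * 2, vocab_size)          # next-token classifier

    def forward(self, src_ids, src_mask, tgt_ids):
        # Encode the (possibly erroneous) source sentence once with BERT.
        memory = self.encoder(input_ids=src_ids, attention_mask=src_mask).last_hidden_state
        # Run the GRU over the shifted target tokens (teacher forcing during training).
        dec_out, _ = self.decoder(self.tgt_embed(tgt_ids))
        # Let every decoder step attend over the BERT representations.
        ctx, _ = self.attn(dec_out, memory, memory, key_padding_mask=~src_mask.bool())
        return self.out(torch.cat([dec_out, ctx], dim=-1))         # (batch, tgt_len, vocab)

    @torch.no_grad()
    def correct(self, sentence, tokenizer, max_len=50):
        # Greedy decoding: emit the most probable token at each step.
        enc = tokenizer(sentence, return_tensors="pt")
        memory = self.encoder(**enc).last_hidden_state
        pad_mask = ~enc["attention_mask"].bool()
        ids, hidden = [tokenizer.cls_token_id], None
        for _ in range(max_len):
            step = self.tgt_embed(torch.tensor([[ids[-1]]]))
            dec_out, hidden = self.decoder(step, hidden)
            ctx, _ = self.attn(dec_out, memory, memory, key_padding_mask=pad_mask)
            next_id = int(self.out(torch.cat([dec_out, ctx], dim=-1))[0, -1].argmax())
            if next_id == tokenizer.sep_token_id:
                break
            ids.append(next_id)
        return tokenizer.decode(ids[1:])


# Example usage (the decoder here is untrained, so the output is only illustrative):
# tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
# model = BertGRUCorrector().eval()
# print(model.correct("我今天去學校上課", tokenizer))

    The design choice this sketch illustrates is the one the abstract argues for: the expensive self-attention stack runs only once over the input (as the encoder), while generation is handled by a cheaper recurrent decoder, which is where a full Transformer/BERT decoder would otherwise spend most of its inference time.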

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1  Introduction
      1.1  Research Background
      1.2  Research Motivation
      1.3  Research Goals and Main Contributions
    Chapter 2  Background and Related Work
      2.1  Introduction to Chinese Sentence Correction
      2.2  Sequence-to-Sequence Transduction
      2.3  Sequence Models Based on Recurrent Neural Networks
        2.3.1  Vanilla Recurrent Neural Network
        2.3.2  Long Short-Term Memory
        2.3.3  Gated Recurrent Unit
      2.4  Transformer
        2.4.1  BERT
    Chapter 3  Methodology
      3.1  Sentence Correction System Architecture
      3.2  Preprocessing
        3.2.1  Vocabulary
        3.2.2  Tokenizer
        3.2.3  Embedding Layer
      3.3  Language Model
      3.4  Comparison of Encoder and Decoder Characteristics
        3.4.1  Encoder Characteristics
        3.4.2  Decoder Characteristics
      3.5  Analysis and Directions for Improvement
      3.6  BERT-RNN
        3.6.1  BERT-GRU
        3.6.2  BERT-LSTM
      3.7  Training Method
        3.7.1  Maximum Likelihood Estimation
      3.8  Inference
        3.8.1  Greedy Decoding
        3.8.2  Beam Search
    Chapter 4  Experiments
      4.1  Experimental Design
      4.2  Experimental Environment
      4.3  Training Parameters
      4.4  Data Sources
      4.5  Evaluation Method
      4.6  Vocabulary Settings
        4.6.1  Source Vocabulary
        4.6.2  Target Vocabulary
      4.7  Models Used in the Experiments and Their Naming
      4.8  25-Character Experiment
        4.8.1  Comparison with the Original RNN-Based Models
        4.8.2  Comparison with the Original Transformer-Based Models
        4.8.3  Comparison of the Hybrid Models and the Original Models
        4.8.4  Improvement of the Hybrid Models over the RNN-Based Models
        4.8.5  Improvement of the Hybrid Models over the Transformer-Based Models
      4.9  50-Character Experiment
      4.10  128-Character Experiment
      4.11  Comparison of the Three Experiments
      4.12  Demonstration of Chinese Sentence Correction Results
    Chapter 5  Conclusion
    Chapter 6  References

    [1] Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
    [2] Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
    [3] Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.
    [4] Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in neural information processing systems (pp. 3104-3112).
    [5] Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735-1780.
    [6] Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
    [7] Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002, July). BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting of the Association for Computational Linguistics (pp. 311-318).
    [8] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).
    [9] Loper, E., & Bird, S. (2002). NLTK: the natural language toolkit. arXiv preprint cs/0205028.
    [10] Zhao, Y., Jiang, N., Sun, W., & Wan, X. (2018, August). Overview of the NLPCC 2018 shared task: Grammatical error correction. In CCF International Conference on Natural Language Processing and Chinese Computing (pp. 439-445). Springer, Cham.
    [11] Chen, T., Li, M., Li, Y., Lin, M., Wang, N., Wang, M., ... & Zhang, Z. (2015). Mxnet: A flexible and efficient machine learning library for heterogeneous distributed systems. arXiv preprint arXiv:1512.01274.
    [12] Guo, J., He, H., He, T., Lausen, L., Li, M., Lin, H., ... & Zhang, A. (2020). GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing. Journal of Machine Learning Research, 21(23), 1-7.
    [13] Ge, T., Zhang, X., Wei, F., & Zhou, M. (2019, July). Automatic grammatical error correction for sequence-to-sequence text generation: An empirical study. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (pp. 6059-6064).
    [14] Schmaltz, A., Kim, Y., Rush, A. M., & Shieber, S. M. (2016). Sentence-level grammatical error identification as sequence-to-sequence correction. arXiv preprint arXiv:1604.04677.
    [15] Li, S., Zhao, J., Shi, G., Tan, Y., Xu, H., Chen, G., ... & Lin, Z. (2019). Chinese grammatical error correction based on convolutional sequence to sequence model. IEEE Access, 7, 72905-72913.
    [16] Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2014). Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555.

    Full-text release date: 2025/08/24 (campus network)
    Full-text release date: 2025/08/24 (off-campus network)
    Full text: not authorized for public release (National Central Library: Taiwan NDLTD system)