
Graduate student: Kai-wen Chung (鍾凱雯)
Thesis title: A Study on Efficiency Improvement for Discriminative Training Approaches (鑑別式訓練法效率改進之研究)
Advisor: Bor-shen Lin (林伯慎)
Committee members: Hong-yan Gu (古鴻炎), Chuan-kai Yang (楊傳凱)
Degree: Master
Department: School of Management - Department of Information Management
Year of publication: 2011
Academic year of graduation: 99 (ROC calendar)
Language: Chinese
Pages: 80
Keywords: maximum likelihood (最大化相似度法), discriminative training (鑑別式訓練法則), maximum mutual information (最大化交互資訊法), minimum phone error (最小化音素錯誤法), N-Best, beam search
    Discriminative training has been shown to yield higher speech recognition accuracy than conventional maximum likelihood training. To improve discrimination, it updates the competing models in the opposite direction, which increases the distance between the correct model and its competitors. Updating every competing model this way means training a very large number of models, which costs a great deal of time and system resources, and earlier studies did not consider how to select suitable competing models to make discriminative training more efficient. This thesis therefore investigates word graph pruning methods for improving training efficiency, with experiments on maximum mutual information (MMI) and minimum phone error (MPE) training. The results show that (1) reducing the number of competing paths in the N-best output cuts training time by 83.7% without significantly sacrificing the recognition rate; (2) beam search, which eliminates uncompetitive paths, not only shrinks the word graph and speeds up training but also avoids the damage that excessive negative updates do to the models, giving a slight gain in accuracy; and (3) observing the distribution of competing paths under different beam widths shows that MPE training effectively separates the correct model from potential competing models, so that only the more probable competing paths remain and the word graph is smaller.
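
The two pruning ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes each competing path is already a complete hypothesis with a single log score, whereas the thesis prunes word graphs during decoding. The names `n_best`, `beam_prune`, and the example scores are hypothetical.

```python
from typing import List, Tuple

# A competing path is modeled here as (label, log_score); higher is better.
# This flat representation is an assumption made for illustration only.
Path = Tuple[str, float]


def n_best(paths: List[Path], n: int) -> List[Path]:
    """Keep only the n highest-scoring competing paths."""
    return sorted(paths, key=lambda p: p[1], reverse=True)[:n]


def beam_prune(paths: List[Path], beam_width: float) -> List[Path]:
    """Keep paths whose log score lies within beam_width of the best score."""
    if not paths:
        return []
    best = max(score for _, score in paths)
    threshold = best - beam_width
    # The original order of the surviving paths is preserved.
    return [p for p in paths if p[1] >= threshold]


# Hypothetical scores, chosen only to show the two behaviours.
paths = [("a", -10.0), ("b", -12.5), ("c", -30.0), ("d", -11.0)]
print(n_best(paths, 2))
print(beam_prune(paths, 5.0))
```

N-best selection fixes the number of surviving competitors, while beam pruning fixes a score margin, so the number of survivors depends on how the scores are distributed. That difference is what lets a beam width reveal how far the competing paths sit from the best one.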

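    Chapter 1 Introduction
    1.1 Research Motivation
    1.2 Background
    1.3 Summary of Results
    1.4 Thesis Organization
    Chapter 2 Background Knowledge
    2.1 Discriminative Training
    2.2 Minimum Bayes Risk
    2.3 Maximum Mutual Information
    2.4 Minimum Phone Error
    2.5 Chapter Summary
    Chapter 3 Corpus Collection
    3.1 Corpus Description
    3.2 Corpus Preprocessing
    3.3 Chapter Summary
    Chapter 4 Baseline Experiments on Discriminative Training
    4.1 Maximum Mutual Information
    4.2 Minimum Phone Error
    4.3 Chapter Summary
    Chapter 5 Word Graph Pruning
    5.1 Selection of Competitors
    5.2 Word Graph Pruning with the N-Best Algorithm
    5.3 Word Graph Pruning with Beam Search
    5.4 Chapter Summary
    Chapter 6 Conclusions and Future Work
    6.1 Conclusions
    6.2 Future Work
    References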

    [1] Lawrence R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition”, Proceedings of the IEEE, Vol. 77, No. 2, pp. 257-285, Feb (1989).
    [2] Jun Du, Peng Liu, Frank K. Soong, Jian-Lai Zhou, Ren-Hua Wang, “Minimum Divergence Based Discriminative Training”, INTERSPEECH ICSLP, pp. 2410-2413, Sep (2006).
    [3] V. Goel, S. Kumar, W. Byrne, “Segmental Minimum Bayes-Risk Decoding for Automatic Speech Recognition”, IEEE Transactions on Speech and Audio Processing, Vol. 12, pp. 234-249, May (2004).
    [4] Alain Biem, “Minimum Classification Error Training for Online Handwriting Recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 28, No. 7, pp. 1041-1051, Jul (2006).
    [5] Lalit R. Bahl, Peter F. Brown, Peter V. de Souza, Robert L. Mercer, “Maximum Mutual Information Estimation of Hidden Markov Model Parameters for Speech Recognition”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’86), Vol. 11, pp. 49-52 (1986).
    [6] V. Valtchev, J.J. Odell, P.C. Woodland, S.J. Young, “MMIE Training of Large Vocabulary Recognition Systems”, Speech Communication, Vol. 22, No. 4, pp. 303-314, Sep (1997).
    [7] Daniel Povey, “Discriminative Training for Large Vocabulary Speech Recognition”, Dissertation submitted to the University of Cambridge for the degree of Doctor of Philosophy, March (2003).
    [8] D. Povey, P.C. Woodland, “Minimum Phone Error and I-smoothing for Improved Discriminative Training”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Vol. 2, pp. I-105 - I-108 (2002).
    [9] Shih-Hung Liu, Fang-Hui Chu, Shih-Hsiang Lin, Hung-Shin Lee, Berlin Chen, “Training Data Selection for Improving Discriminative Training of Acoustic Models”, IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), pp. 284-289 (2007).
    [10] Biing-Hwang Juang, Wu Chou, Chin-Hui Lee, “Minimum Classification Error Rate Methods for Speech Recognition”, IEEE Transactions on Speech and Audio Processing, Vol. 5, No. 3, May (1997).
    [11] Hong-Kwang Jeff Kuo, Brian Kingsbury, Geoffrey Zweig, “Discriminative Training of Decoding Graphs for Large Vocabulary Continuous Speech Recognition”, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Vol. 4, pp. IV-45 - IV-48 (2007).
    [12] Erik McDermott, Timothy J. Hazen, Jonathan Le Roux, Atsushi Nakamura, Shigeru Katagiri, “Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error”, IEEE Transactions on Audio, Speech, and Language Processing, Vol. 15, No. 1, pp. 203-223 (2007).
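    [13] 李琳山, 陳佳妤, “A Preliminary Study of Minimum Phone Error Model and Feature Training for Mandarin Large-Vocabulary Recognition” (in Chinese), Master's thesis, Graduate Institute of Communication Engineering, National Taiwan University, Taipei (2006).
    [14] 張智星, 許碩斌, “A Preliminary Study of Applying the Minimum Phone Error Discriminative Training Criterion to a Continuous Phone Recognition System” (in Chinese), Master's thesis, Graduate Institute of Computer Science and Information Engineering, National Tsing Hua University, Hsinchu (2006).
    [15] 莊堯棠, 黃夢晨, “A Study of Competing Speakers in Minimum-Error Discriminative Training for Speaker Recognition” (in Chinese), Master's thesis, Graduate Institute of Electrical Engineering, National Central University, Taoyuan (2008).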
    [16] Steve Young, Gunnar Evermann, Mark Gales, Thomas Hain, Dan Kershaw, Xunying (Andrew) Liu, Gareth Moore, Julian Odell, Dave Ollason, Dan Povey, Valtcho Valtchev, Phil Woodland, “The HTK Book (for HTK Version 3.4)” (2009).
    [17] P.S. Gopalakrishnan, D. Kanevsky, A. Nadas, D. Nahamoo, “Generalization of the Baum Algorithm to Rational Objective Functions”, IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-89), pp. 631-634 (1989).
