
Author: Kuan-da Huang (黃冠達)
Thesis Title: Keyword Verification for Mandarin Chinese Based on Support Vector Machines (應用支撐向量機於中文關鍵詞驗證之研究)
Advisor: Bor-shen Lin (林伯慎)
Committee Members: Hsin-min Wang (王新民), Hung-yan Gu (古鴻炎)
Degree: Master
Department: College of Management, Department of Information Management
Publication Year: 2007
Graduation Academic Year: 95
Language: Chinese
Pages: 72
Keywords: Support Vector Machines (支撐向量機), Initial-Final Verification (聲韻母驗證), Keyword Spotting (關鍵詞擷取), Keyword Verification (關鍵詞驗證)

Because recognition errors are unavoidable in speech recognition systems, keyword verification is an essential component of speech recognition applications. Support Vector Machines (SVMs), which optimize the decision boundary, have proven to be effective classifiers for pattern recognition. In this research, we first use SVMs for initial-final verification, and then verify keywords by combining the outputs of the initial-final classifiers. Because this approach does not require training an anti-keyword model or classifier for each individual keyword, the keyword spotting system is more extensible and can easily be ported to different application domains.
In this thesis, we compare the discriminative power of four verification features: acoustic score, duration, anti-model distance, and reference path distance. We then combine these features to improve verification performance. We also compare different kernel functions and optimize the SVM parameters for each. Experimental results show that, for keyword verification, an SVM with a linear kernel achieves an equal error rate of 23.61%, better than both the 38.61% of the reference method based on pre-calculated optimal thresholds and the 26.77% of a neural network classifier. Adding word-length-dependent thresholds further reduces the equal error rate to 19.33%.
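The equal error rate (EER) quoted in the abstract is the operating point at which the false rejection rate and the false acceptance rate are equal. The following is a minimal sketch of how such a figure can be computed, assuming each keyword hypothesis carries a single confidence score where higher means more likely correct. The function name, variable names, and toy scores are illustrative assumptions and do not come from the thesis.

```python
def equal_error_rate(correct_scores, false_alarm_scores):
    """Return (eer, threshold) at the threshold where the false rejection
    rate and the false acceptance rate are closest to each other.

    Assumption: a hypothesis is accepted when its score >= threshold.
    """
    # Only observed scores need to be tried as thresholds, because the
    # error rates change only at those values.
    candidates = sorted(set(correct_scores) | set(false_alarm_scores))
    best = None
    for t in candidates:
        # False rejection: a correct keyword scored below the threshold.
        frr = sum(s < t for s in correct_scores) / len(correct_scores)
        # False acceptance: a false alarm scored at or above the threshold.
        far = sum(s >= t for s in false_alarm_scores) / len(false_alarm_scores)
        gap = abs(frr - far)
        if best is None or gap < best[0]:
            # With finite data the two rates may not meet exactly,
            # so report their average at the closest point.
            best = (gap, (frr + far) / 2, t)
    return best[1], best[2]


# Toy scores (hypothetical, not data from the thesis).
correct_scores = [0.9, 0.8, 0.7, 0.4]
false_alarm_scores = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(correct_scores, false_alarm_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")  # EER = 25.00% at threshold 0.6
```

In this toy case, a threshold of 0.6 rejects one of four correct keywords and accepts one of four false alarms, so both error rates are 25%. A length-dependent variant would, under the same assumptions, run this search separately for each keyword-length group.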

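Chapter 1 Introduction
1.1 Research Motivation
1.2 Background
1.3 Thesis Objectives and Summary of Results
1.4 Thesis Organization
Chapter 2 Keyword Spotting Techniques
2.1 Speech Preprocessing and Feature Extraction
2.2 Acoustic Model Training and Recognition
2.3 Keyword Recognition Techniques
2.4 Keyword Verification Techniques
2.5 Chapter Summary
Chapter 3 Support Vector Machines
3.1 Introduction
3.2 Support Vector Machines
3.3 The Non-Separable Case
3.4 Nonlinear Support Vector Machines
3.5 Support Vector Machine Tools
3.6 Chapter Summary
Chapter 4 Applying Support Vector Machines to Initial-Final Verification
4.1 Introduction
4.2 Classification Feature Extraction
4.3 Training Data Generation
4.4 Initial-Final Verification Experiments
4.5 Chapter Summary
Chapter 5 Applying Support Vector Machines to Keyword Verification
5.1 Reference Verification Methods
5.2 Keyword Verification Method Using Support Vector Machines
5.3 Experimental Results
5.4 Keyword Verification with Speech-Length Classification
5.5 Chapter Summary
Chapter 6 Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
Appendices
(1) List of Keywords
(2) List of Command Sentences
References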

