Graduate student: | 黃冠達 Kuan-da Huang |
---|---|
Thesis title: | 應用支撐向量機於中文關鍵詞驗證之研究 Keyword Verification for Mandarin Chinese Based on Support Vector Machines |
Advisor: | 林伯慎 Bor-shen Lin |
Oral examination committee: | 王新民 Hsin-min Wang, 古鴻炎 Hung-yan Gu |
Degree: | Master |
Department: | School of Management - Department of Information Management |
Year of publication: | 2007 |
Academic year of graduation: | 95 |
Language: | Chinese |
Number of pages: | 72 |
Chinese keywords: | 支撐向量機、聲韻母驗證、關鍵詞擷取、關鍵詞驗證 |
English keywords: | Support Vector Machines, Initial-Final Verification, Keyword Spotting, Keyword Verification |
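Because recognition errors are unavoidable in speech recognition systems, keyword verification has become an essential part of speech recognition applications. Support vector machines, which optimize the decision boundary, have been shown to be effective classifiers in pattern recognition. In this thesis, we first use support vector machines for initial-final verification, and then use the outputs of the initial-final classifiers to verify keywords. This approach does not require training an anti-keyword model or classifier for each individual keyword, so the keyword spotting system is easier to extend and to port to different application domains. We compare the discriminative power of several verification features, namely acoustic score, duration, anti-model distance, and reference path distance, and improve verification performance by combining different input features. We further compare different kernel functions and optimize the corresponding support vector machine parameters. Experimental results show that, for keyword verification, a support vector machine with a linear kernel function achieves an equal error rate of 23.61%, better than both the 38.61% obtained with pre-calculated optimal thresholds and the 26.77% obtained with a neural network; adding word-length-dependent thresholds further reduces the equal error rate to 19.33%.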
Because recognition errors are unavoidable in speech recognition, keyword spotting plays an important role. Support Vector Machines, which optimize a decision boundary, have proved to be excellent classifiers in pattern recognition. In our research, we first use Support Vector Machines for initial-final verification, and then perform verification at the keyword level by combining the outputs of the phone-level classifiers. Because we do not train keyword-specific models or classifiers, our keyword verification system is more portable and can easily be moved to a different application domain.
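The two-level idea can be illustrated with a minimal sketch. The abstract does not say how the phone-level outputs are combined, so the plain average used here is an assumption, and the names `keyword_confidence` and `verify_keyword` as well as the scores are hypothetical.

```python
from statistics import mean


def keyword_confidence(unit_scores):
    """Average the per-unit (initial/final) classifier outputs of one keyword hypothesis.

    Averaging is an assumed combination rule, not the thesis's stated method.
    """
    if not unit_scores:
        raise ValueError("a keyword hypothesis needs at least one initial/final score")
    return mean(unit_scores)


def verify_keyword(unit_scores, threshold=0.0):
    """Accept the hypothesis when its averaged score reaches the threshold."""
    return keyword_confidence(unit_scores) >= threshold


# Made-up signed decision values, one per initial or final in the hypothesis.
accepted = verify_keyword([1.2, 0.4, -0.1, 0.9])
rejected = verify_keyword([-0.8, 0.3, -0.5, -0.2])
```

Because the score is built only from classifiers shared by all initials and finals, adding a new keyword in this sketch requires no additional training.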
In this thesis, we propose four verification features: acoustic score, duration, anti-model distance, and reference path distance. We first compare the discriminative power of the features, and then combine them to obtain better performance. We also compare the performance of different kernel functions and tune the parameters of the Support Vector Machines. Experimental results show that Support Vector Machines with a linear kernel function achieve an equal error rate of 23.61%, better than the 38.61% of the reference verification system and the 26.77% of the neural network classifier. Finally, using length-dependent thresholds further decreases the equal error rate to 19.33%.
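For readers unfamiliar with the metric, the equal error rate is the operating point at which the false rejection rate and the false acceptance rate coincide. The following is a minimal sketch of one common way to estimate it from two score lists; the function name `equal_error_rate` and the scores are hypothetical and are not data from the thesis.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Return (eer, threshold) at the point where the two error rates are closest.

    A trial is accepted when its score is greater than or equal to the threshold.
    """
    candidates = sorted(set(target_scores) | set(impostor_scores))
    best = None
    for threshold in candidates:
        # False rejection: a correct keyword scored below the threshold.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance: a false alarm scored at or above the threshold.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        gap = abs(frr - far)
        if best is None or gap < best[0]:
            best = (gap, (frr + far) / 2, threshold)
    return best[1], best[2]


# Made-up verification scores for illustration only.
correct = [0.9, 0.8, 0.7, 0.35, 0.6]
false_alarms = [0.1, 0.2, 0.4, 0.3, 0.65]
eer, threshold = equal_error_rate(correct, false_alarms)
```

With these toy scores, one of five correct keywords is rejected and one of five false alarms is accepted at the selected threshold, so both error rates are 20%.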