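Graduate student: Wun-yi Jhao (趙文儀)
Thesis title: A Study on Speaker Adaptation Based on Reference Speakers (以參考語者為基礎之語者調適方法研究)
Advisor: Bor-shen Lin (林伯慎)
Oral examination committee: Nai-wei Lo (羅乃維), Hung-yan Gu (古鴻炎)
Degree: Master
Department: College of Management - Department of Information Management
Year of publication: 2010
Academic year of graduation: 98
Language: Chinese
Number of pages: 33
Keywords (translated from Chinese): speaker adaptation, weighted model combination, feature parameter transformation, maximum likelihood linear regression (MLLR) adaptation, maximum a posteriori (MAP) adaptation
Keywords (English): speaker adaptation, feature transformation, model combination, MAP, MLLR, reference speakers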
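    In speaker adaptation, how to raise the recognition rate effectively when only limited speech data are available has long been an important issue. This thesis investigates several speaker adaptation methods based on reference speakers, which can reach a good recognition rate while requiring only a small amount of speech from the user. Reference speakers are speakers in the corpus whose statistical characteristics are close to those of the target speaker. The methods examined are of three kinds: adapting directly with speech data close to the target speaker, forming a weighted combination of reference speaker models, and transforming the features of reference speakers. We further modify the "reference speaker feature transformation" method by incorporating the target speaker's MLLR_MAP-adapted model to improve the recognition rate. Finally, we combine feature transformation with weighted model combination and discuss the effect on the recognition rate.
    The experimental results show that incorporating the target speaker's adapted model into the "reference speaker feature transformation" method raises the recognition rate by up to about 6.67% over the speaker-independent model, a clear improvement. As for the effect of adjusting the variances, the results show that keeping the original variances during model re-estimation, rather than re-adjusting them, usually gives a better recognition rate. Finally, combining "reference speaker feature transformation" with "reference speaker model weighting" gives a better recognition rate than a weighted combination that uses only the reference speakers close to the target speaker.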
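
For context, the two standard adaptation updates named in this abstract can be written as below. This is textbook notation for mean-only MLLR and MAP adaptation, not necessarily the notation used in the thesis. The symbols $W$, $\xi_m$, $\tau$, $N_m$ and $\gamma_m(t)$ are introduced here for illustration, and reading "MLLR_MAP" as MAP adaptation that takes the MLLR-adapted mean as its prior is an assumption.

```latex
% MLLR: one shared affine transform applied to the Gaussian means
\hat{\mu}_m^{\mathrm{MLLR}} = A\,\mu_m + b = W\,\xi_m,
\qquad \xi_m = \begin{bmatrix} 1 & \mu_m^{\top} \end{bmatrix}^{\top},
\quad W = \begin{bmatrix} b & A \end{bmatrix}

% MAP: interpolate a prior mean with the mean of the adaptation data
\hat{\mu}_m^{\mathrm{MAP}}
  = \frac{N_m}{N_m + \tau}\,\bar{x}_m + \frac{\tau}{N_m + \tau}\,\mu_m^{\mathrm{prior}},
\qquad N_m = \sum_t \gamma_m(t),
\quad \bar{x}_m = \frac{1}{N_m} \sum_t \gamma_m(t)\,x_t
```

When little adaptation data is available, $N_m$ is small and the MAP estimate stays close to the prior mean. Both updates above change only the means and leave the variances as they are.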


    How to improve the recognition rate effectively through speaker adaptation has long been an important issue for speech recognition systems. In this thesis we discuss several speaker adaptation methods based on reference speakers: adapting with the speech of speakers in the database that are close to the target speaker, forming a weighted combination of the acoustic models of the reference speakers, and transforming the features of reference speakers during re-estimation. Each of these basic methods improves adaptation performance. In addition, we modify and combine these methods to obtain new adaptation schemes that perform better than the MAP and MLLR methods.
    In the experimental results, the best improvement in recognition rate over the speaker-independent model is about 6.67%, obtained with the feature transformation approach. Integrating the model combination and feature transformation approaches for reference speakers also yields better performance. We conclude that the data or models of reference speakers can help improve adaptation performance, provided the reference speakers are close enough to the target speaker.
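
The reference-speaker idea described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes each corpus speaker is represented by a diagonal-covariance GMM, ranks speakers by the average log-likelihood of the target speaker's frames, and combines the selected speakers' mean vectors with weights. The softmax weighting, the function names and the toy data are all assumptions made for illustration; the thesis's own weight estimation is not given in this record.

```python
import math

def diag_gmm_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            lp = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                lp += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(lp)
        # log-sum-exp over mixture components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)

def select_reference_speakers(frames, speaker_gmms, k):
    """Return the k speakers whose GMMs best explain the target frames."""
    scores = {name: diag_gmm_loglik(frames, *gmm)
              for name, gmm in speaker_gmms.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)[:k]
    return ranked, scores

def softmax_weights(scores, names):
    """Hypothetical weighting: softmax of the selection scores (an assumption)."""
    m = max(scores[n] for n in names)
    exps = [math.exp(scores[n] - m) for n in names]
    z = sum(exps)
    return [e / z for e in exps]

def combine_means(ref_means, weights):
    """Weighted combination of the reference speakers' mean vectors."""
    dim = len(ref_means[0])
    return [sum(w * mu[d] for w, mu in zip(weights, ref_means))
            for d in range(dim)]

# Toy corpus: one single-Gaussian "GMM" per speaker, as (weights, means, variances).
speaker_gmms = {
    "spk_a": ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    "spk_b": ([1.0], [[1.0, 1.0]], [[1.0, 1.0]]),
    "spk_c": ([1.0], [[5.0, 5.0]], [[1.0, 1.0]]),
}
# A small amount of (made-up) adaptation data from the target speaker.
target_frames = [[0.9, 1.1], [1.2, 0.8], [0.7, 1.0]]

refs, scores = select_reference_speakers(target_frames, speaker_gmms, k=2)
weights = softmax_weights(scores, refs)
adapted_mean = combine_means([speaker_gmms[n][1][0] for n in refs], weights)
print(refs, weights, adapted_mean)
```

In this toy setup the distant speaker `spk_c` is excluded, and the combined mean lies between the two selected speakers' means, closer to the better-matching one. Only the means are combined here; the variances are left unchanged.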

    Chapter 1 Introduction
      1.1 Research Motivation
      1.2 Objectives and Summary of Results
      1.3 Thesis Organization
    Chapter 2 Literature Review and Related Techniques
      2.1 Gaussian Mixture Models
        2.1.1 Introduction to Gaussian Mixture Models
        2.1.2 Methods for Selecting Reference Speakers
      2.2 Speaker Adaptation Methods
        2.2.1 Maximum Likelihood Linear Regression (MLLR) Adaptation
        2.2.2 Maximum A Posteriori (MAP) Adaptation
      2.3 Adaptation Methods Based on Reference Speakers
        2.3.1 "Reference Speaker Model Weighting" Adaptation
        2.3.2 "Reference Speaker Feature Transformation" Adaptation
      2.4 Chapter Summary
    Chapter 3 Baseline Experiments
      3.1 Adaptation with Similar Speech Data
      3.2 Experiments on "Reference Speaker Model Weighting" Adaptation
      3.3 Experiments on "Reference Speaker Feature Transformation" Adaptation
      3.4 Comparison of Adjusting versus Not Adjusting the Variances
      3.5 Chapter Summary
    Chapter 4 Experimental Improvements
      4.1 Improving the "Reference Speaker Feature Transformation" Adaptation Experiments
      4.2 Combining the "Reference Speaker Model Weighting" and "Reference Speaker Feature Transformation" Methods
      4.3 Comparison of Adaptation Methods
      4.4 Chapter Summary
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Work
    References

