Graduate student: 劉立祥 (Li-hsiang Liu)
Thesis title: 動態更新語者模型及語者驗證 (Dynamic reconstruction speaker-dependent model and speaker verification)
Advisor: 洪西進 (Shi-jinn Horng)
Oral examination committee: 梅興 (Hsing Mei), 王有禮 (Yue-li Wang), 王振興 (Jeen-shing Wang), 楊昌彪 (Chang-biau Yang)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: Chinese
Pages: 79
Chinese keywords: 語者驗證, 隱藏式馬可夫模型
English keywords: speaker verification, hidden Markov model
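  • This thesis addresses a maintenance problem in speaker verification systems: speaker models age and are difficult to update, so system performance degrades over time. It proposes a strategy and procedure for dynamically updating the speaker model. The utterances supplied at verification time are combined with the original training utterances to retrain the model, so that the model stays consistent with the speaker's current state. In the implementation, Mel-Frequency Cepstral Coefficients (MFCCs) serve as the speaker features and are combined with a Hidden Markov Model (HMM) to build speaker-dependent models. At verification time, if the test utterance is verified successfully, the system enters the dynamic model update procedure.
    Experimental results on two Chinese corpora, with 53 and 50 speakers respectively, show that the dynamic update procedure improves the Equal Error Rate (EER) by 39.87% and 49.97%, with best EERs of 2.226% and 2.25%. In addition, with speaker models built for 134 speakers from the YOHO corpus, the best EER is 2.238%. The system was also given a user-friendly interface integrated with real-time recording.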
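The update procedure described in the abstract (accept a test utterance, then retrain on the original training utterances plus the newly accepted one) can be sketched roughly as below. This is only an illustration under loud assumptions: `SpeakerModel`, `verify_and_update`, the distance-based score, and the threshold are all hypothetical names and choices. The thesis trains HMMs on MFCC sequences, whereas this toy model stores a per-dimension mean so that the sketch runs without any speech library.

```python
from dataclasses import dataclass, field
from typing import List

# Toy feature vector. Assumption: the thesis uses MFCC sequences instead.
Utterance = List[float]


@dataclass
class SpeakerModel:
    """Hypothetical stand-in for a speaker-dependent HMM."""
    utterances: List[Utterance] = field(default_factory=list)
    mean: List[float] = field(default_factory=list)

    def train(self) -> None:
        # Retrain from scratch on every stored utterance.
        dims = len(self.utterances[0])
        count = len(self.utterances)
        self.mean = [
            sum(u[d] for u in self.utterances) / count for d in range(dims)
        ]

    def score(self, utt: Utterance) -> float:
        # Higher is better: negative squared distance to the model mean.
        return -sum((x - m) ** 2 for x, m in zip(utt, self.mean))


def verify_and_update(model: SpeakerModel, utt: Utterance,
                      threshold: float) -> bool:
    """Verify an utterance; on acceptance, add it to the pool and retrain."""
    accepted = model.score(utt) >= threshold
    if accepted:
        model.utterances.append(utt)
        model.train()
    return accepted


if __name__ == "__main__":
    model = SpeakerModel(utterances=[[1.0, 2.0], [1.2, 1.8]])
    model.train()
    print(verify_and_update(model, [1.4, 2.0], threshold=-0.5))  # accepted
    print(verify_and_update(model, [5.0, 5.0], threshold=-0.5))  # rejected
```

Only accepted utterances enter the training pool, so a rejected attempt leaves the model unchanged. A falsely accepted impostor utterance would also be absorbed into the model, which is why the acceptance threshold matters in a scheme of this kind.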


    This thesis addresses the problem of maintaining performance in a speaker verification system. It proposes a strategy for dynamically reconstructing the speaker-dependent model. Mel-Frequency Cepstral Coefficients (MFCCs) are used as the features extracted from the training and test utterances, and a Hidden Markov Model (HMM) serves as the classifier on which each speaker-dependent model is built.

    Experimental results show that the proposed strategy improves the Equal Error Rate (EER) by 39.87% and 49.97% on two different Chinese corpora, reaching EERs of 2.226% and 2.25%, respectively. On the YOHO corpus, with models built for 134 speakers, the best result is an EER of 2.238%.
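The EER reported above is the operating point at which the false acceptance rate (FAR) equals the false rejection rate (FRR). A minimal sketch of how it can be estimated from verification scores follows. The function names and the example scores are made up for illustration, and the threshold sweep is one common approximation, not necessarily the evaluation procedure used in the thesis.

```python
from typing import List, Tuple


def error_rates(genuine: List[float], impostor: List[float],
                threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine: List[float], impostor: List[float]) -> float:
    """Approximate the EER by trying every observed score as a threshold."""
    best_gap = None
    eer = 1.0
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # Keep the threshold where FAR and FRR are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    genuine_scores = [0.9, 0.8, 0.7, 0.4]   # hypothetical true-speaker scores
    impostor_scores = [0.6, 0.3, 0.2, 0.1]  # hypothetical impostor scores
    print(equal_error_rate(genuine_scores, impostor_scores))  # 0.25
```

With few trials, FAR and FRR move in coarse steps and may never coincide exactly, so the sketch reports the midpoint of the two rates at the closest threshold.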

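    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Research Objectives
      1.3 Thesis Organization
    Chapter 2 Background
      2.1 Overview of Speaker Recognition
      2.2 Speech Signal Preprocessing
        2.2.1 Speech Signal Production
        2.2.2 Endpoint Detection
        2.2.3 Framing
        2.2.4 Pre-emphasis
        2.2.5 Hamming Window
      2.3 Feature Extraction
        2.3.1 Fast Fourier Transform
        2.3.2 Mel Spectral Coefficients
        2.3.3 Mel Energy Weighting
        2.3.4 Log Energy Computation
        2.3.5 Discrete Cosine Transform
        2.3.6 Delta Cepstral Coefficients
    Chapter 3 Hidden Markov Model
      3.1 Model Description
      3.2 Model Parameters
      3.3 Viterbi Algorithm
    Chapter 4 System Architecture
      4.1 System Block Diagram
      4.2 Speaker Training, Verification, and Model Update Procedures
        4.2.1 Speaker Training Procedure
        4.2.2 Speaker Verification Procedure
        4.2.3 Dynamic Speaker Model Update Procedure
      4.3 System Performance Evaluation
    Chapter 5 System Implementation and Analysis
      5.1 Development Environment
      5.2 System Implementation
      5.3 Experimental Results and Analysis
        5.3.1 Speech Databases
        5.3.2 Speaker Verification Results on Multimedia Laboratory Corpus (A)
        5.3.3 Speaker Verification Results on Multimedia Laboratory Corpus (B)
        5.3.4 Speaker Verification Results on the YOHO Corpus (English)
        5.3.5 Experimental Data and Analysis
    Chapter 6 Conclusion
    References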


    Full-text availability: not authorized for public release (on-campus network, off-campus network, or the National Central Library's Taiwan theses and dissertations system)