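Graduate student: | Yan-zuo Jhou (周彥佐) |
---|---|
Thesis title: | Synthesis of Mandarin and Min-nan Speech Based on Harmonic-plus-noise Model (基於HNM之國語、閩南語的語音合成研究) |
Advisor: | Hung-Yan Gu (古鴻炎) |
Oral defense committee: | Ming-Shing Yu (余明興), Hsin-Min Wang (王新民), Shaw-Hwa Hwang (黃紹華), Kuo-Liang Chung (鍾國亮) |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering |
Year of publication: | 2007 |
Academic year of graduation: | 95 (ROC calendar) |
Language: | Chinese |
Number of pages: | 70 |
Keywords (Chinese): | 諧波加噪音模型 (harmonic-plus-noise model), 語音合成 (speech synthesis) |
Keywords (English): | HNM, speech synthesis |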
This thesis studies methods for improving the clarity of synthesized speech signals on the basis of the harmonic-plus-noise model (HNM), so that clear speech can still be synthesized when the prosody parameters are changed over a wide range. To verify the methods, we built a preliminary speech synthesis system for Mandarin and Min-nan. Several concepts from computer music synthesis are adopted and extended. The ADSR (attack, decay, sustain, release) concept is extended to the signal segments within a syllable: it is used to allocate durations to the constituent segments and to define a piece-wise linear time mapping, which improves fluency. In addition, under the condition that timbre remains consistent, we attempt to implement a way of adjusting the pitch contour of a syllable to a desired one with HNM signal processing. Prosody parameter values are generated mainly by referring to, or directly using, programs and rules from previous studies by others. The system was evaluated with preliminary listening tests. The scores show that the clarity of speech synthesized with the HNM-based method studied here is significantly better than that of the previously proposed TIPW method.
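The piece-wise linear time mapping mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no details, so everything specific below is an assumption made for illustration and not the author's actual procedure: the function names `allocate_boundaries` and `map_time`, the example boundary values, and the allocation rule that keeps the first (attack-like) and last (release-like) segments at their original length while stretching the inner segments uniformly.

```python
from bisect import bisect_right


def allocate_boundaries(src_bounds, target_dur):
    """Return segment boundaries on the target time axis.

    Hypothetical rule (an assumption, not taken from the thesis): keep the
    first and last segments at their original length and scale the inner
    segments uniformly so the syllable lasts target_dur seconds.
    src_bounds must hold at least two increasing boundary times.
    """
    durs = [b - a for a, b in zip(src_bounds, src_bounds[1:])]
    fixed = durs[0] + durs[-1]
    inner = sum(durs[1:-1])
    if len(durs) < 3 or target_dur <= fixed or inner <= 0:
        # Not enough room to protect the edges: scale every segment equally.
        scale = target_dur / (src_bounds[-1] - src_bounds[0])
        new_durs = [d * scale for d in durs]
    else:
        scale = (target_dur - fixed) / inner
        new_durs = [durs[0]] + [d * scale for d in durs[1:-1]] + [durs[-1]]
    bounds = [0.0]
    for d in new_durs:
        bounds.append(bounds[-1] + d)
    return bounds


def map_time(t, tgt_bounds, src_bounds):
    """Map time t on the target axis to the source axis.

    The mapping is linear inside each segment, so segment boundaries on the
    target axis land exactly on the matching source boundaries.
    """
    i = bisect_right(tgt_bounds, t) - 1
    i = max(0, min(i, len(tgt_bounds) - 2))  # clamp to a valid segment
    t0, t1 = tgt_bounds[i], tgt_bounds[i + 1]
    s0, s1 = src_bounds[i], src_bounds[i + 1]
    if t1 == t0:
        return s0
    return s0 + (t - t0) * (s1 - s0) / (t1 - t0)


# Illustrative boundaries (seconds) of a recorded syllable, four segments.
src = [0.0, 0.04, 0.10, 0.20, 0.25]
tgt = allocate_boundaries(src, 0.40)  # lengthen the syllable to 0.40 s
for k in range(5):
    t = 0.1 * k
    print(f"target {t:.2f} s -> source {map_time(t, tgt, src):.4f} s")
```

In a synthesizer of this kind, the source time returned for each synthesis instant would presumably be used to select or interpolate the analysis parameters recorded around that instant. Protecting the edge segments is only one possible design; a different allocation rule changes `allocate_boundaries` but leaves `map_time` unchanged.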