Graduate student: |
梁弘學 Hung-Hsueh Liang |
---|---|
Thesis title: |
英語歌聲合成之研究 A Study on English Singing Voice Synthesis |
Advisor: |
古鴻炎 Hung-Yan Gu |
Oral defense committee: |
余明興 Ming-Shing Yu 王新民 Hsin-Min Wang 鍾國亮 Kuo-Liang Chung 林彥君 Yen-Chun Lin |
Degree: |
Master |
Department: |
College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of publication: | 2009 |
Academic year of graduation: | 97 |
Language: | Chinese |
Number of pages: | 75 |
Chinese keywords: | 英語歌聲合成 |
Foreign-language keywords: | English singing voice synthesis |
This thesis studies the synthesis of English singing voices. First, a corpus of about 10,000 syllables is segmented from real English sentences, but it yields only 1,389 distinct syllables. To address this severe shortage of synthesis units, we propose a syllable-unit construction method that either attaches a consonant to the front or end of an existing syllable or concatenates two semi-syllable units. The signal model is the harmonic-plus-noise model (HNM), so a syllable unit is constructed at the level of HNM parameters, as a sequence of frames of HNM parameters. To make the synthesized singing more natural, we apply an ANN model trained on Mandarin songs to generate vibrato parameters for English syllables, and we implement dynamic syllable-duration adjustment and volume control. An initial English singing voice synthesis system has been built, and its synthetic songs were evaluated in subjective listening tests. The results show that songs synthesized by preferring whole syllable units and songs synthesized by forced concatenation of semi-syllable units are nearly indistinguishable. In addition, in a comparison with a commercial singing-voice synthesis package, the scores of the two systems are very close.
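The unit-construction idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: syllables are modeled as tuples of phoneme symbols, the vowel set is a hypothetical ARPAbet-style list, the function name `plan_unit` is invented, and the fallback order (whole syllable, then attaching a consonant, then two semi-syllables) is only one plausible reading. The sketch plans which pieces would be joined; it does not model the joining itself, which the abstract says happens at the level of HNM parameters.

```python
from typing import FrozenSet, List, Tuple

# A syllable is modeled as a tuple of phoneme symbols, e.g. ("S", "T", "AA", "R").
Syllable = Tuple[str, ...]

# Hypothetical vowel set (ARPAbet-style symbols); an assumption, not from the thesis.
VOWELS: FrozenSet[str] = frozenset({
    "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
    "EY", "IH", "IY", "OW", "OY", "UH", "UW",
})


def plan_unit(target: Syllable,
              inventory: FrozenSet[Syllable]) -> Tuple[str, List[Syllable]]:
    """Return a (strategy, pieces) plan for building the target syllable.

    The fallback order is an assumption made for this sketch.
    """
    # 1. The recorded inventory already contains the whole syllable.
    if target in inventory:
        return "whole", [target]

    if len(target) > 1:
        # 2. Attach a consonant to the front of an existing syllable.
        if target[0] not in VOWELS and target[1:] in inventory:
            return "prepend_consonant", [target[:1], target[1:]]
        # 3. Attach a consonant to the end of an existing syllable.
        if target[-1] not in VOWELS and target[:-1] in inventory:
            return "append_consonant", [target[:-1], target[-1:]]

    # 4. Fall back to two semi-syllables that share the first vowel:
    #    onset + vowel, and vowel + coda. A real system would also need
    #    to check that both halves exist in a semi-syllable inventory.
    for i, phoneme in enumerate(target):
        if phoneme in VOWELS:
            return "semi_syllables", [target[:i + 1], target[i:]]

    raise ValueError("target syllable contains no vowel")


if __name__ == "__main__":
    inventory = frozenset({("T", "AA", "R"), ("S", "IH")})
    for syllable in [("T", "AA", "R"), ("S", "T", "AA", "R"),
                     ("S", "IH", "T"), ("B", "EH", "D")]:
        print(syllable, "->", plan_unit(syllable, inventory))
```

With the toy inventory in the usage block, the first syllable is found whole, the second and third are planned as an existing syllable plus one consonant, and the last falls back to two semi-syllables.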