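Graduate student: | 廖皇量 Huang-liang Liao |
---|---|
Thesis title: | 國語歌聲合成信號品質改進之研究 Improving of Signal Quality for Mandarin Singing Voice Synthesis |
Advisor: | 古鴻炎 Hung-Yan Gu |
Oral defense committee: | 張智星 Jyh-Shing Jang, 黃紹華 Shaw-Hwa Hwang, 洪西進 Shi-Jinn Horng, 鍾國亮 Kuo-Liang Chung |
Degree: | 碩士 Master |
Department: | 電資學院 (College of Electrical Engineering and Computer Science) - 資訊工程系 Department of Computer Science and Information Engineering |
Year of publication: | 2006 |
Academic year of graduation: | 94 (ROC calendar) |
Language: | Chinese |
Pages: | 83 |
Chinese keywords: | 歌聲 (singing voice), 諧波加噪音模型 (harmonic plus noise model), 捲積噪音 (convolution noise), 相位同步處理 (phase-synchronization processing), ADSR音長分配 (ADSR duration allocation) |
Foreign-language keywords: | singing voice, harmonic plus noise model, convolution noise |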
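This thesis builds on the harmonic plus noise model and studies improvements to the synthesis model for Mandarin singing voice signals in order to obtain better sound quality. For parameter analysis, we revise the method for detecting the fundamental frequency of each frame, use a dynamically set threshold to determine the amplitude of the maximum voiced frequency, and improve the harmonic tracking method to resolve harmonic discontinuities between frames. For signal synthesis, to raise the naturalness of the singing voice signal, we study and add a method for processing phase information, adopt an ADSR scheme to lengthen and shorten durations, and add convolution noise analysis and generation to the model to compensate for the original model's weakness in the low-frequency region. After implementing these methods in software, we conducted listening-test evaluations; preliminary results show that the synthesis model studied in this thesis significantly improves the naturalness and clarity of Mandarin singing voice.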
In this thesis, we construct a model for Mandarin singing voice synthesis. The model improves and extends a base model, the harmonic plus noise model, and generates a sound signal of better quality. In the analysis of model parameters, we improve the method for detecting the pitch of a signal frame, use a dynamic threshold to determine the maximum voiced frequency of a frame, and improve the partial-tracking method to solve the problem of discontinuous harmonics between adjacent frames. In synthesizing the signal waveform from the model, we design and add several processing modules to improve the naturalness of the synthesized singing signal. These modules include (a) phase-synchronization processing, (b) duration lengthening and shortening based on ADSR, and (c) convolution noise analysis and synthesis, which provides modeling ability for the low-frequency part. We have used the improved model to implement a real-time Mandarin singing voice synthesis system. A perception test was then performed to evaluate the system. The results show that the naturalness and clarity of the singing voice synthesized by our system are significantly better than those of previous systems.
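For readers unfamiliar with the base model, the following is a minimal sketch of the general harmonic plus noise idea, not the thesis's implementation. It assumes a frame is synthesized as a sum of cosines at integer multiples of the pitch, limited to the maximum voiced frequency, plus a noise term. The function name `synthesize_hnm_frame`, its parameters, and the use of unshaped white noise are illustrative assumptions; the phase-synchronization, ADSR, and convolution noise modules described in the abstract are not modeled here.

```python
import math
import random


def synthesize_hnm_frame(f0, amplitudes, phases, max_voiced_freq,
                         sample_rate=16000, num_samples=320,
                         noise_gain=0.0, seed=0):
    """Synthesize one frame as harmonics plus noise (illustrative only).

    f0               pitch of the frame in Hz
    amplitudes[k-1]  amplitude of the k-th harmonic
    phases[k-1]      initial phase of the k-th harmonic in radians
    max_voiced_freq  harmonics above this frequency (Hz) are not generated
    noise_gain       scale of the noise part (0.0 disables it)
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    frame = []
    for n in range(num_samples):
        t = n / sample_rate
        value = 0.0
        # Harmonic part: cosines at k * f0 up to the maximum voiced frequency.
        for k, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
            freq = k * f0
            if freq > max_voiced_freq:
                break
            value += amp * math.cos(2.0 * math.pi * freq * t + phase)
        # Noise part: plain white Gaussian noise as a simplifying assumption.
        # A real harmonic plus noise model shapes the noise spectrum.
        value += noise_gain * rng.gauss(0.0, 1.0)
        frame.append(value)
    return frame


# Usage: a 200 Hz frame whose third harmonic (600 Hz) lies above the
# maximum voiced frequency and is therefore left out of the harmonic part.
frame = synthesize_hnm_frame(
    f0=200.0,
    amplitudes=[1.0, 0.5, 0.25],
    phases=[0.0, 0.0, 0.0],
    max_voiced_freq=450.0,
)
```

Raising `max_voiced_freq` admits more harmonics into the deterministic part, which is why the choice of that frequency for each frame affects the perceived quality of the result.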