
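Graduate student: 廖皇量 (Huang-liang Liao)
Thesis title: 國語歌聲合成信號品質改進之研究 (Improving of Signal Quality for Mandarin Singing Voice Synthesis)
Advisor: 古鴻炎 (Hung-Yan Gu)
Oral defense committee: 張智星 (Jyh-Shing Jang), 黃紹華 (Shaw-Hwa Hwang), 洪西進 (Shi-Jinn Horng), 鍾國亮 (Kuo-Liang Chung)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2006
Academic year of graduation: 94 (ROC calendar)
Language: Chinese
Number of pages: 83
Keywords (translated from Chinese): singing voice, harmonic plus noise model, convolution noise, phase synchronization processing, ADSR, duration allocation
Keywords (English): singing voice, harmonic plus noise model, convolution noise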
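  • Building on the harmonic plus noise model (HNM), this thesis studies improvements to the synthesis model for Mandarin singing voice signals in order to obtain better sound quality. For parameter analysis, we revise the method for detecting the fundamental frequency of each frame, switch to a dynamically set threshold for determining the amplitude at the maximum voiced frequency, and improve the harmonic (partial) tracking method to resolve discontinuities in harmonics between frames. For signal synthesis, to raise the naturalness of the singing signal, we add processing of phase information, adopt an ADSR-based approach to lengthen and shorten durations, and add the analysis and generation of convolution noise to the model to compensate for the original model's weakness in the low-frequency range. After implementing these methods in software, we conducted listening tests. Preliminary results show that the synthesis model studied in this thesis significantly improves the naturalness and clarity of synthesized Mandarin singing voice.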
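The abstract mentions ADSR-based lengthening and shortening of durations but does not say how the time is divided among segments. The sketch below shows one plausible allocation rule and should not be read as the thesis's actual method: the function name, the segment lengths, and the rule itself (attack, decay and release keep their original lengths when possible, and the sustain segment absorbs the change) are all assumptions made for illustration.

```python
def allocate_adsr_durations(original, target_total):
    """Split target_total (ms) across attack/decay/sustain/release segments.

    Assumption (not from the thesis): attack, decay and release keep their
    original lengths whenever the target is long enough, and the sustain
    segment absorbs the difference. Otherwise every segment is scaled by
    the same factor.
    """
    if target_total <= 0:
        raise ValueError("target_total must be positive")
    fixed = original["attack"] + original["decay"] + original["release"]
    if target_total > fixed:
        result = dict(original)
        result["sustain"] = target_total - fixed
        return result
    # Target too short to preserve the transient segments: scale uniformly.
    scale = target_total / sum(original.values())
    return {name: length * scale for name, length in original.items()}


# Hypothetical segment lengths (ms) for one recorded syllable.
syllable = {"attack": 40.0, "decay": 30.0, "sustain": 200.0, "release": 50.0}
longer = allocate_adsr_durations(syllable, 600.0)   # sustain grows to 480 ms
shorter = allocate_adsr_durations(syllable, 80.0)   # all segments scaled by 0.25
```

Keeping the transient segments fixed is a common motivation for ADSR-style stretching, since the onset and offset of a syllable tend to carry more of its identity than the steady portion.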


    In this thesis, we construct an improved model for Mandarin singing voice synthesis. The model improves and extends a base model, the harmonic plus noise model (HNM), and generates signals of better quality. In the analysis of model parameters, we improve the method for detecting the pitch of a signal frame, use a dynamic threshold to determine the maximum voiced frequency of a frame, and improve the partial tracking method to solve the problem of discontinuous harmonics between adjacent frames. In using the model to synthesize signal waveforms, we design and add several processing modules to improve the naturalness of the synthesized singing signal. These modules include (a) phase-synchronization processing, (b) duration lengthening and shortening based on ADSR, and (c) convolution noise analysis and synthesis, which adds modeling ability for the low-frequency part. We have used the improved model to implement a real-time Mandarin singing voice synthesis system, and a perception test was performed to evaluate it. The results show that the naturalness and clarity of the singing voice synthesized by our system are significantly better than those of previous systems.
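To make the base model in the abstract concrete, here is a much-simplified sketch of one harmonic-plus-noise frame: harmonics of the fundamental at or below the maximum voiced frequency are summed as cosines, and a noise term is added. It is an illustration under stated assumptions, not the thesis's implementation. The function name, parameter values and sample rate are hypothetical, and the noise here is plain unfiltered white noise, whereas a full HNM shapes the noise spectrally and confines it to the band above the maximum voiced frequency.

```python
import math
import random


def synthesize_hnm_frame(f0, amplitudes, phases, max_voiced_freq,
                         noise_gain, sample_rate=16000, num_samples=160,
                         seed=0):
    """Return one frame as a list: harmonic sum plus scaled white noise."""
    rng = random.Random(seed)
    # Keep only harmonics k*f0 at or below the maximum voiced frequency.
    partials = [
        (k * f0, a, p)
        for k, (a, p) in enumerate(zip(amplitudes, phases), start=1)
        if k * f0 <= max_voiced_freq
    ]
    frame = []
    for n in range(num_samples):
        t = n / sample_rate
        harmonic = sum(a * math.cos(2.0 * math.pi * f * t + p)
                       for f, a, p in partials)
        # Simplification: unfiltered white noise (see the note above).
        noise = noise_gain * rng.uniform(-1.0, 1.0)
        frame.append(harmonic + noise)
    return frame


# Hypothetical parameters: with f0 = 220 Hz and a 500 Hz maximum voiced
# frequency, only the first two harmonics (220 Hz, 440 Hz) are kept.
clean = synthesize_hnm_frame(220.0, [1.0, 0.5, 0.25], [0.0, 0.0, 0.0],
                             max_voiced_freq=500.0, noise_gain=0.0)
noisy = synthesize_hnm_frame(220.0, [1.0, 0.5, 0.25], [0.0, 0.0, 0.0],
                             max_voiced_freq=500.0, noise_gain=0.1)
```

In a real system the per-frame parameters would come from the analysis stage (pitch detection, harmonic amplitudes and phases), and consecutive frames would be overlapped and phase-aligned rather than generated independently.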

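    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures and Tables
    Chapter 1 Introduction
      1.1 Research Motivation and Objectives
      1.2 Review of Singing Voice Synthesis Research
      1.3 Research Method
      1.4 Thesis Organization
    Chapter 2 HNM Analysis and Synthesis Methods
      2.1 Introduction to HNM
      2.2 Analysis Methods
        2.2.1 Pitch Detection
        2.2.2 Harmonic Part Analysis
        2.2.3 Noise Part Analysis
      2.3 Synthesis Methods
        2.3.1 Harmonic Part Synthesis
        2.3.2 Noise Part Synthesis
    Chapter 3 Improving Synthesized Sound Quality
      3.1 Convolution Noise Generation
        3.1.1 Convolution Noise Parameter Analysis
        3.1.2 Synthesizing Convolution Noise
      3.2 Phase Processing
        3.2.1 Phase Increment Control
        3.2.2 Adjusting Phase Relationships among Harmonics
        3.2.3 Phase Delay Relative to the Fundamental Frequency
      3.3 ADSR-Based Duration Lengthening and Shortening
        3.3.1 Introduction to ADSR
        3.3.2 Syllable Segmentation
      3.4 Partial Tracking
    Chapter 4 Mandarin Singing Voice Synthesis
      4.1 Mandarin Syllable Signal Processing
        4.1.1 Analysis Stage
        4.1.2 Synthesis Stage
      4.2 Simulation of Singing Techniques
        4.2.1 Pitch Transition (轉音)
        4.2.2 Vibrato (抖音)
      4.3 Building the Singing Voice Synthesis System
        4.3.1 Lyrics File Processing
        4.3.2 Determining Prosodic Parameters
        4.3.3 Handling Short Durations
        4.3.4 Playback Processing
    Chapter 5 Experimental Verification
      5.1 Comparison of Synthesized Signals
        5.1.1 Convolution Noise
        5.1.2 Phase Adjustment
        5.1.3 ADSR
      5.2 Listening Test Evaluation
    Chapter 6 Conclusion
    References
    About the Author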

