
Author: 林正甫 (Zhang-fu Lin)
Thesis title: 使用ANN抖音參數模型之國語歌聲合成 (Mandarin Singing Voice Synthesis Using ANN Vibrato Parameter Models)
Advisor: 古鴻炎 (Hung-Yan Gu)
Oral defense committee: 林伯慎 (Bor-shen Lin), 范欽雄 (Chin-shyurng Fahn), 陳錫明 (Si-ming Chen), 王新民 (Hsin-min Wang)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: Chinese
Number of pages: 84
Keywords (Chinese): 國語歌聲合成, 抖音
Keywords (English): Mandarin singing synthesis, vibrato

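This thesis addresses vibrato, an important factor in singing expression. We analyze sung syllables with the short-time Fourier transform and the analytic-signal method to obtain vibrato parameters. We also apply the same analysis method to obtain the vibration parameters of the waveform envelope. Once the vibrato and vibration parameters of each syllable have been obtained, they are used to train a separate artificial neural network (ANN) model for each parameter. The outputs of the trained ANN models, combined with rules such as note fullness (滿度) and downbeat points, then control a harmonic-plus-noise model (HNM) to synthesize the singing voice signal. In subjective listening tests of naturalness, the scores show that singing signals synthesized with both vibrato and vibration parameters are indeed significantly better than those synthesized with the original HNM.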


This thesis focuses on the analysis and synthesis of vibrato, an important factor in singing expression. We analyze the vibrato parameters of a sung syllable using the short-time Fourier transform and the analytic-signal method. In addition, we apply the same procedure to obtain the vibration parameters of a syllable's waveform envelope. Once the vibrato and amplitude-vibration parameter values have been obtained for each sung syllable, they are used to train a separate artificial neural network (ANN) model for each parameter type. These ANN models are then used to generate the vibrato and vibration parameters. Next, these parameters and other relevant musical parameters are used together to control a harmonic-plus-noise model (HNM) that synthesizes the singing voice signals. Subjective perception tests are conducted on the synthetic singing voices. The results show that the singing signal synthesized under the control of the vibrato and vibration parameters is noticeably better than the singing signal synthesized without such control.
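The analytic-signal step mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's implementation: it assumes a pitch (F0) contour has already been extracted at a fixed frame rate, it runs on a synthetic contour, and the names `analytic_signal`, `vibrato_parameters`, and `frame_rate` are hypothetical. The idea is to remove the mean pitch, form the analytic signal of the remaining oscillation, and read the vibrato rate from the phase slope and the vibrato extent from the magnitude.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal via the FFT: keep DC, double positive bins, zero negative bins."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0          # Nyquist bin is kept once
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * weights)


def vibrato_parameters(f0_hz, frame_rate):
    """Estimate vibrato rate (Hz) and extent (cents) from a pitch contour.

    f0_hz: F0 values in Hz, one per frame (assumed already extracted).
    frame_rate: frames per second of the contour (assumed constant).
    """
    f0_hz = np.asarray(f0_hz, dtype=float)
    # Express the contour in cents around its mean so only the oscillation remains.
    cents = 1200.0 * np.log2(f0_hz / np.mean(f0_hz))
    cents = cents - np.mean(cents)
    z = analytic_signal(cents)
    # Instantaneous frequency is the derivative of the unwrapped phase.
    phase = np.unwrap(np.angle(z))
    inst_freq = np.diff(phase) * frame_rate / (2.0 * np.pi)
    inst_extent = np.abs(z)
    # Ignore 10% of the frames at each end, where edge effects are largest.
    edge = len(cents) // 10
    rate = float(np.median(inst_freq[edge:-edge]))
    extent = float(np.median(inst_extent[edge:-edge]))
    return rate, extent


if __name__ == "__main__":
    # Synthetic contour: 440 Hz with a 6 Hz vibrato of +/-50 cents, 200 frames/s.
    frame_rate = 200.0
    t = np.arange(200) / frame_rate
    f0 = 440.0 * 2.0 ** (50.0 * np.sin(2.0 * np.pi * 6.0 * t) / 1200.0)
    rate, extent = vibrato_parameters(f0, frame_rate)
    print(f"vibrato rate: {rate:.2f} Hz, extent: {extent:.1f} cents")
```

Because the synthetic contour is built with a 6 Hz rate and a 50-cent extent, the estimates should come out close to those values. The same two functions could in principle be applied to a waveform amplitude envelope instead of a pitch contour, which mirrors the abstract's remark that the analysis is reused for the envelope's vibration parameters. Real contours would need smoothing and handling of slow pitch drift.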

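Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
Chapter 1 Introduction
  1.1 Research Motivation and Objectives
  1.2 Review of Singing Voice Synthesis Research
  1.3 Research Method
  1.4 Thesis Organization
Chapter 2 Vibrato Parameter Analysis of Signals
  2.1 Preprocessing for Vibrato Parameter Analysis
  2.2 Review of Vibrato Parameter Extraction Methods
  2.3 Pitch-Period Peak-Valley Method
  2.4 Instantaneous Frequency Method
    2.4.1 STFT Analysis
    2.4.2 Analytic Signal Analysis
  2.5 Experiments and Improvements for Vibrato Parameter Analysis
    2.5.1 Experiments with the Pitch-Period Peak-Valley Method
    2.5.2 Experiments with the Instantaneous Frequency Method
    2.5.3 Instantaneous Frequency Analysis Using the Analytic Signal
    2.5.4 Instantaneous Frequency Analysis Using the STFT
    2.5.5 Intonation (音位) Trajectory Computation
    2.5.6 Measurement of Time-Varying Vibrato Rate and Extent
    2.5.7 Sampling, Storage, and Normalization of Vibrato Parameters
  2.6 Analysis of Waveform-Envelope Vibration Parameters
Chapter 3 Artificial Neural Network Models
  3.1 Introduction to Artificial Neural Networks
  3.2 Artificial Neural Network Structure
  3.3 Input and Output Parameters of the Artificial Neural Network
    3.3.1 Input Parameters
    3.3.2 Output Parameters
  3.4 Experiments on the Number of Units
Chapter 4 Mandarin Singing Voice Synthesis
  4.1 Generation of Expression Parameters
    4.1.1 Generation of the Pitch Contour
    4.1.2 Generation of the Waveform Envelope
  4.2 Determination of Musical Parameters
    4.2.1 Note Fullness (滿度) Handling
    4.2.2 Downbeat Handling
    4.2.3 Note-Transition (轉音) Handling
    4.2.4 Singing Volume Handling
  4.3 HNM Singing Voice Synthesis System Combining Vibrato and Waveform Envelope
Chapter 5 Experiments and Conclusions
  5.1 Singing Voice Synthesis System
  5.2 Listening Test Evaluation
  5.3 Conclusions
References
About the Author
Appendix A: Lyrics of the Training Songs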

