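Improving Singing Voice Quality for Singing Synthesizer | National Taiwan University of Science and Technology Electronic Theses and Dissertations System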


Author:	龍俊成 Chun-cheng Loong
Thesis title:	Improving Singing Voice Quality for Singing Synthesizer (對於歌唱聲合成器的聲音品質增進之研究)
Advisor:	洪西進 Shi-jinn Horng
Committee members:	陳秋華 Chyou-hwa Chen, 高宗萬 Tzong-wann Kao
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication:	2009
Academic year of graduation:	97 (ROC calendar, i.e. 2008-2009)
Language:	Chinese
Pages:	51
Keywords:	Cubic Hermite Spline, ADSR model, Fast Fourier Transform, additive synthesis

Many systems that synthesize voice by computer have appeared in recent years, some for ordinary speech and others for singing. For singing, computer-synthesized voices still differ noticeably from human singing, and making a synthesized voice resemble a human one remains an open research topic.
This thesis implements an additive-synthesis system based on the sinusoidal model, with the aim of producing natural and smooth singing. In the analysis stage, a Fast Fourier Transform is applied to each pair of adjacent frames to obtain the spectrum, and Cubic Hermite Spline interpolation is then used to obtain more accurate frequency, amplitude, and phase values. In the synthesis stage, the ADSR model is first used to calculate the number of frames in each segment of the synthesized note, the pitch curve of the synthesized note is then adjusted, and finally Cubic Hermite Spline interpolation is used to obtain the harmonic parameters of the synthesized note. The system also simulates two singing techniques, pitch sweeps (transition notes) and vibrato.
To evaluate the synthesized singing, listening tests were conducted that compared this system with a synthesis system built earlier by a senior student. The results show that the method used in this thesis does improve the naturalness and smoothness of the synthesized singing.
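
The synthesis idea in the abstract (interpolating per-frame harmonic parameters with a Cubic Hermite Spline and summing sinusoids) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the central-difference tangents, and the choice to accumulate phase from the interpolated frequency (rather than interpolating measured phase, as the abstract describes) are all assumptions made for illustration.

```python
import math


def hermite(p0, p1, m0, m1, t):
    """Cubic Hermite interpolation on the unit interval (0 <= t <= 1).

    p0, p1 are the endpoint values; m0, m1 are the endpoint tangents.
    """
    t2, t3 = t * t, t * t * t
    h00 = 2 * t3 - 3 * t2 + 1
    h10 = t3 - 2 * t2 + t
    h01 = -2 * t3 + 3 * t2
    h11 = t3 - t2
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1


def interpolate_track(values, samples_per_frame):
    """Expand one parameter value per frame into one value per sample.

    Assumption: tangents are central differences of neighbouring frames
    (one-sided at the ends); the thesis may choose them differently.
    """
    n = len(values)
    tangents = []
    for k in range(n):
        lo = values[max(k - 1, 0)]
        hi = values[min(k + 1, n - 1)]
        tangents.append((hi - lo) / 2.0)
    out = []
    for k in range(n - 1):
        for s in range(samples_per_frame):
            t = s / samples_per_frame
            out.append(hermite(values[k], values[k + 1],
                               tangents[k], tangents[k + 1], t))
    out.append(values[-1])  # include the final frame boundary
    return out


def synthesize(freq_frames, amp_frames, samples_per_frame, sample_rate):
    """Additive synthesis: sum one sinusoid per harmonic track.

    freq_frames[h][k] and amp_frames[h][k] hold the frequency (Hz) and
    amplitude of harmonic h at frame k. Phase is accumulated from the
    interpolated frequency, which is a simplification.
    """
    n_samples = (len(freq_frames[0]) - 1) * samples_per_frame + 1
    signal = [0.0] * n_samples
    for f_track, a_track in zip(freq_frames, amp_frames):
        freqs = interpolate_track(f_track, samples_per_frame)
        amps = interpolate_track(a_track, samples_per_frame)
        phase = 0.0
        for i in range(n_samples):
            signal[i] += amps[i] * math.sin(phase)
            phase += 2.0 * math.pi * freqs[i] / sample_rate
    return signal


if __name__ == "__main__":
    # Illustrative (made-up) data: two harmonics over four frames.
    freqs = [[220.0, 222.0, 221.0, 220.0], [440.0, 444.0, 442.0, 440.0]]
    amps = [[0.0, 0.6, 0.5, 0.0], [0.0, 0.3, 0.2, 0.0]]
    wave = synthesize(freqs, amps, samples_per_frame=256, sample_rate=16000)
    print(len(wave), "samples synthesized")
```

A cubic Hermite segment matches both the value and the slope at each frame boundary, so the interpolated amplitude and frequency tracks have no abrupt slope changes between frames, which is presumably why this family of splines suits a system aiming for smooth output.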

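Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
1.1 Research Motivation and Objectives
1.2 Research Method of This Thesis
1.3 Thesis Organization
Chapter 2  Background on Speech and Singing Voice Synthesis
2.1 Structural Model of the Human Vocal Tract
2.2 Time-Domain Pitch-Synchronous Overlap-Add
2.3 Sinusoidal Model Synthesis
2.4 Other Speech Synthesis Methods
Chapter 3  Singing Voice Synthesis with the Sinusoidal Model
3.1 Basic Concepts of Sinusoids
3.2 Concept of Sinusoidal Model Synthesis
Chapter 4  Implementation of Singing Voice Synthesis
4.1 Segmentation of Sample Sounds
4.2 Reading the Score and Sample Sound Data
4.2.1 Reading the Score
4.2.2 Reading the Sample Sounds
4.3 Spectral Analysis
4.4 Duration Adjustment
4.5 Pitch Curve Adjustment
4.6 Synthesis of Frequency, Phase, and Amplitude
4.7 Simulation of Pitch Sweeps and Vibrato
4.7.1 Handling Pitch Sweeps
4.7.2 Handling Vibrato
4.8 Synthesizing the Time-Domain Waveform
Chapter 5  Experimental Evaluation
5.1 Test Method
5.2 Evaluation Results
Chapter 6  Conclusions and Future Work
References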
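
Sections 4.4 and 4.7.2 of the outline deal with duration adjustment (the ADSR frame counts mentioned in the abstract) and vibrato. The Python sketch below shows one plausible way to do each; it is not taken from the thesis. The allocation policy (fixed attack, decay, and release lengths with the sustain absorbing the remainder), the default frame counts, and the sinusoidal vibrato with its rate and depth values are all hypothetical choices for illustration.

```python
import math


def allocate_adsr_frames(total_frames, attack=4, decay=3, release=5):
    """Split a note of total_frames frames into ADSR segment lengths.

    Hypothetical policy: attack, decay, and release keep fixed lengths
    and the sustain segment takes whatever is left. If the note is too
    short, the three fixed segments shrink proportionally and the
    sustain is dropped.
    """
    fixed = attack + decay + release
    if total_frames >= fixed:
        return {"attack": attack, "decay": decay,
                "sustain": total_frames - fixed, "release": release}
    scale = total_frames / fixed
    a = int(attack * scale)
    d = int(decay * scale)
    r = total_frames - a - d  # remainder keeps the total exact
    return {"attack": a, "decay": d, "sustain": 0, "release": r}


def vibrato_pitch_curve(base_hz, n_frames, frame_rate,
                        rate_hz=5.5, depth_cents=50.0, onset_frames=0):
    """Return one pitch value (Hz) per frame with sinusoidal vibrato.

    The pitch stays at base_hz for the first onset_frames frames, then
    oscillates by +/- depth_cents around base_hz at rate_hz. The rate
    and depth defaults are illustrative assumptions.
    """
    curve = []
    for k in range(n_frames):
        if k < onset_frames:
            cents = 0.0
        else:
            t = (k - onset_frames) / frame_rate
            cents = depth_cents * math.sin(2.0 * math.pi * rate_hz * t)
        curve.append(base_hz * 2.0 ** (cents / 1200.0))
    return curve


if __name__ == "__main__":
    segments = allocate_adsr_frames(40)
    print(segments)
    # Apply vibrato only during the sustain segment of a 440 Hz note.
    onset = segments["attack"] + segments["decay"]
    pitch = vibrato_pitch_curve(440.0, 40, frame_rate=100.0,
                                onset_frames=onset)
    print(min(pitch), max(pitch))
```

Expressing vibrato depth in cents keeps the pitch deviation musically symmetric around the base note regardless of its frequency.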

                                

