
Author: 孫世諺
Shih-yan Sun
Thesis Title: 國語雙字語詞聲調評分方法之研究
A Study on Scoring Methods for Mandarin Tones Uttered in Disyllabic Words.
Advisor: 古鴻炎
Hung-yan Gu
Committee: 陳錫明
Shyi-ming Chen
王新民
Hsin-Min Wang
林伯慎
Bor-shen Lin
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Thesis Publication Year: 2005
Graduation Academic Year: 94
Language: Chinese
Pages: 61
Keywords (in Chinese): 統計, 倒傳遞神經網路, 基週軌跡
Keywords (in other languages): statistics, back-propagation neural network, pitch contour
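    This thesis develops practical tone-scoring methods for the pronunciation of Mandarin disyllabic words. Scoring proceeds in two stages. The first stage analyzes the pitch contour and normalizes duration and pitch height to extract feature parameters. The second stage is the scoring model: we study an artificial neural network (ANN) model and a statistical model for five-grade scoring, with scores ranging from 1 to 5.
    The ANN studied is a back-propagation network (BPN), implemented in two architectures: (1) without distinguishing tone combinations; (2) with tone combinations distinguished. For the statistical scoring model, we test combinations of three distance measures and three pitch-height normalization methods, and we also study several score-decision methods.
    The mean scoring errors of the two BPN architectures are 0.813 and 0.858 points, respectively. Among the statistical scoring methods, Manhattan distance combined with the middle-score-mean decision method achieves a mean scoring error of 0.487; with a hybrid scoring method, the mean scoring error drops further to 0.472. In this study, therefore, statistical scoring outperforms the BPN scoring model, and the mean error between program scores and human scores falls below 0.5 of a grade point.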


    In this thesis, we develop practical methods for scoring Mandarin tones uttered in disyllabic words. Scoring is divided into two stages. In the first stage, the pitch contour of an utterance is analyzed, and duration and pitch height are normalized to extract the needed feature parameters. In the second stage, scoring modules determine the score. We study two types of scoring systems: artificial neural network (ANN) based and statistics based. Scores range from 1 to 5 points.
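The first-stage normalization can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: duration is normalized by resampling the contour to a fixed number of points with linear interpolation, and pitch height is normalized by converting to a log2 scale and subtracting the utterance mean. The function names and the default `n_points=16` are hypothetical and are not taken from the thesis.

```python
import math


def normalize_duration(contour, n_points=16):
    """Resample a pitch contour to n_points values by linear interpolation.

    Assumption: fixed-length resampling stands in for the thesis's
    duration normalization, whose exact form the abstract does not state.
    """
    if len(contour) < 2 or n_points < 2:
        raise ValueError("need at least two input and two output points")
    last = len(contour) - 1
    resampled = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)   # fractional index into the input
        lo = int(pos)                     # left neighbour
        hi = min(lo + 1, last)            # right neighbour, clamped at the end
        frac = pos - lo
        resampled.append(contour[lo] * (1 - frac) + contour[hi] * frac)
    return resampled


def normalize_pitch_height(contour_hz):
    """Map pitch values in Hz to log2 scale and remove the utterance mean.

    Assumption: mean removal on a log scale is one common choice; it is
    not claimed to be any of the three methods studied in the thesis.
    """
    logs = [math.log2(f) for f in contour_hz]
    mean = sum(logs) / len(logs)
    return [v - mean for v in logs]
```

For example, `normalize_duration([100, 200], 3)` interpolates a midpoint between the two input values, and `normalize_pitch_height` returns values centered on zero so that speakers with different pitch ranges become comparable.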
    The ANN scoring system is a back-propagation network (BPN) built in two structures: one does not distinguish tone combinations, and the other does. For the statistics-based scoring system, three distance measures and three pitch-height normalization methods are combined in the tests to study their influence. In addition, several score-decision methods are compared.
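As a rough illustration of distance-based scoring, the sketch below computes the Manhattan distance between a feature vector and one reference vector per grade, then returns the grade of the closest reference. The nearest-reference rule is an assumption chosen for simplicity; it is not the score-decision method used in the thesis, and `grade_means` is a hypothetical structure holding made-up reference vectors.

```python
def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_grade(features, grade_means):
    """Return the grade whose reference vector is closest in L1 distance.

    grade_means maps a grade (for example 1..5) to a reference feature
    vector. This nearest-reference rule is an illustrative assumption.
    """
    return min(grade_means, key=lambda g: manhattan(features, grade_means[g]))


# Hypothetical reference vectors for three grades
references = {1: [0.0, 0.0], 3: [1.0, 1.0], 5: [2.0, 2.0]}
print(nearest_grade([0.9, 1.1], references))
```

Manhattan distance sums the absolute differences coordinate by coordinate, so a single badly matched contour point contributes linearly rather than quadratically, as it would under Euclidean distance.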
    The average scoring errors of the two BPN structures are 0.813 and 0.858 points, respectively. In the statistics-based scoring system, using the Manhattan distance measure with the middle-grade-mean score-decision method yields an average scoring error of only 0.478 points. With the hybrid scoring method designed here, the scoring error can be reduced further to 0.472 points on average. In this study, therefore, the statistics-based scoring system is shown to be more effective than the BPN-based one. Moreover, the average error between program-assigned and human-assigned scores can be reduced to less than 0.5 points.
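The reported figures are average errors between program and human scores. A minimal sketch of such a metric follows, assuming the error is the mean absolute difference per utterance; the abstract does not say whether absolute or signed differences were averaged. The function name and the sample scores are hypothetical.

```python
def mean_scoring_error(program_scores, human_scores):
    """Mean absolute difference between program and human scores.

    Assumption: 'average scoring error' is read here as mean absolute error.
    """
    if len(program_scores) != len(human_scores) or not program_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    total = sum(abs(p - h) for p, h in zip(program_scores, human_scores))
    return total / len(program_scores)


# Made-up scores on the 1-to-5 scale, for illustration only
print(mean_scoring_error([3, 4, 5], [3, 5, 3]))
```

Under this reading, a mean error below 0.5 means the program's score differs from the human score by less than half a grade on average.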

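    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures and Tables
    Chapter 1 Introduction
        1.1 Research Motivation and Objectives
        1.2 Related Work on Tone Assessment Methods
            1.2.1 Feature Induction Analysis
            1.2.2 Clustering Methods
        1.3 Related Work on Speech Scoring Methods
        1.4 Research Method
        1.5 Thesis Organization
    Chapter 2 Corpus Preparation and Pitch-Contour Preprocessing
        2.1 Corpus Preparation
        2.2 Human Scoring
        2.3 Pitch-Contour Extraction
            2.3.1 Pitch Measurement Method
            2.3.2 Duration Normalization
            2.3.3 Relational Pitch-Height Normalization
                2.3.3.1 First Pitch-Height Normalization Method
                2.3.3.2 Second Pitch-Height Normalization Method
            2.3.4 Independent Pitch-Height Normalization
    Chapter 3 Neural-Network Assessment Models
        3.1 Introduction to Artificial Neural Networks
            3.1.1 Neurons
            3.1.2 Multilayer Neural Networks
        3.2 Back-Propagation Neural Networks
            3.2.1 Network Architecture
            3.2.2 Learning Rule
        3.3 Assessment Model Design
            3.3.1 BPN Model Without Tone-Combination Distinction
            3.3.2 BPN Model With Tone-Combination Distinction
    Chapter 4 Statistics-Based Assessment Methods
        4.1 Distance Measures
        4.2 Scoring Methods
        4.3 Regression Analysis
    Chapter 5 Experimental Results and Conclusions
        5.1 Experiments on BPN Scoring Models
            5.1.1 Experiments Without Tone-Combination Distinction
            5.1.2 Experiments With Tone-Combination Distinction
        5.2 Experiments on Statistics-Based Scoring Methods
            5.2.1 Combinations of Distance Measures and Pitch-Height Normalization
            5.2.2 Comparison of Score-Decision Methods
            5.2.3 Hybrid Scoring Method
            5.2.4 Regression Analysis Results
        5.3 Conclusions
    References
    About the Author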

