Author: 林承億 (Cheng-Yi Lin)
Thesis title: 獨立成分分析法應用於青蛙聲音辨識 (The Study of Applying Independent Component Analysis to Frog Voice Recognition)
Advisor: 楊英魁 (Ying-Kuei Yang)
Oral defense committee: 楊英魁 (Ying-Kuei Yang), 張博綸 (Bo-Lun Zhang), 黎碧煌 (Bih-Hwang Lee), 李建南 (Jian-Nan Li)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2017
Graduation academic year: 106 (ROC calendar)
Language: Chinese
Pages: 61
Keywords (Chinese): 青蛙, 獨立成分分析, 聲音辨識, 音節切割, 特徵值擷取, 梅爾倒頻譜係數, 機器學習, 支援向量機
Keywords (English): Frog, Independent Component Analysis, Voice Recognition, Syllable Segmentation, Feature Extraction, Mel-Frequency Cepstral Coefficients, Machine Learning, Support Vector Machine

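In field biological surveys, sound recognition is commonly used to determine which species are present in an area and in what numbers. Field recordings, however, capture the calls of several species at once, which makes sound recognition harder.
The aim of this thesis is to take a mixed signal containing the calls of frogs of different species and numbers and separate it into the individual frog calls that make it up, referred to as component signals. After the component signals are separated by independent component analysis (ICA), a sound model built on a support vector machine (SVM) predicts the frog species each sound belongs to and their number. Frog recordings provided by AmphibiaWeb serve as the database, and six frog species are selected for the experiments.
For classification, the frog calls are first segmented into syllables. Mel-frequency cepstral coefficients (MFCC) are then used as the features of each syllable and applied to training and testing the sound model.
This thesis examines three outcomes that may arise in the component signals obtained when a mixed signal is processed by ICA and, based on the MFCCs of the syllables in each component signal, proposes three feature extraction methods to improve the recognition of frog calls. The three methods are then applied to the three ICA outcomes, and their recognition performance is compared.
The first feature extraction method takes the average of the syllables' MFCCs as the feature of the component signal and achieves a recognition rate of 59.17%. The second method takes the MFCCs of each syllable, discards syllables of any species whose share of the component signal is below 20%, and weights the remaining syllables by importance; its recognition rate is 70%. The third method takes the MFCCs of each syllable, keeps all component signals for prediction, and weights each syllable by importance; its recognition rate is 71.12%. Based on the test results, the third feature extraction method proposed in this thesis has the best recognition performance.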


Automatic recognition of frog sounds is considered a valuable tool for biological research and environmental monitoring.
In field recordings, however, it is often difficult to obtain a recording that contains the voice of only a single species. Such recordings are called mixed signals, and they usually contain the voices of several different animals. Each individual animal voice is called a component signal, and depending on which component signals a mixed signal contains, sound recognition can become more difficult.
Independent Component Analysis (ICA) is used to separate a mixed signal into its individual component signals. After separation, a Support Vector Machine (SVM) classifies the frog sounds by species.
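As a brief illustration of the separation step, the standard linear, instantaneous ICA model can be written as follows. The abstract does not give the thesis's own notation, so every symbol here is an assumption: s(t) holds the unknown component signals, A is the unknown mixing matrix, x(t) holds the recorded mixed signals, and W is the demixing matrix that ICA estimates.

```latex
\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t),
\qquad
\mathbf{y}(t) = \mathbf{W}\,\mathbf{x}(t) \approx \mathbf{P}\,\mathbf{D}\,\mathbf{s}(t)
```

Here P is a permutation matrix and D a diagonal scaling matrix: ICA recovers the component signals only up to their order and amplitude, which is one reason a classifier is still needed to decide which species each recovered signal belongs to.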
In this study, all frog sounds are retrieved from AmphibiaWeb.
Before classification, the demixed samples are segmented into syllables, and Mel-Frequency Cepstral Coefficients (MFCC) are extracted from each syllable as its features.
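MFCCs are computed on the mel frequency scale. The minimal sketch below shows one widely used Hz-to-mel conversion and how filter centre frequencies can be spaced evenly in mel. The abstract does not state which mel formula, frequency range, or filter count the thesis uses, so the formula variant, the function names, and the parameter values are all assumptions for illustration.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using one common variant of the formula (assumed)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(f_min, f_max, n_filters):
    """Centre frequencies (Hz) of n_filters filters equally spaced in mel."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]


# Hypothetical settings: 20 filters between 0 Hz and 8 kHz.
centers = mel_center_frequencies(0.0, 8000.0, 20)
```

The centre frequencies are dense at low frequencies and sparse at high ones, which mirrors how the mel scale compresses the upper part of the spectrum.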
This study proposes three feature extraction methods. The experimental results show that the first method (averaging the MFCCs of all syllables) identifies the frog species with an accuracy of 59.17%. The second method (weighting the MFCCs by proportion and ignoring syllables whose share is below 20%) reaches an accuracy of 70%. The third method (the same as the second but keeping all syllables) achieves an accuracy of 71.2%. According to these results, the third method is the best of the three for feature extraction.
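The three methods can be sketched roughly as below. The abstract does not specify how the weights are computed, so this sketch assumes that each syllable has already been assigned a species label by a classifier and that a species' weight is simply its share of the syllables in one component signal. The function names and the `min_share` parameter are hypothetical, not taken from the thesis.

```python
from collections import Counter


def mean_mfcc(syllable_mfccs):
    """Method 1 (sketch): average the MFCC vectors of all syllables."""
    n = len(syllable_mfccs)
    dim = len(syllable_mfccs[0])
    return [sum(v[i] for v in syllable_mfccs) / n for i in range(dim)]


def weighted_vote(syllable_labels, min_share=0.0):
    """Methods 2 and 3 (sketch): weight each species by its share of syllables.

    Species whose share is below min_share are dropped and the remaining
    shares are renormalised. min_share=0.2 mimics the second method;
    min_share=0.0 keeps every syllable, as in the third method.
    """
    counts = Counter(syllable_labels)
    total = sum(counts.values())
    shares = {sp: c / total for sp, c in counts.items()}
    kept = {sp: s for sp, s in shares.items() if s >= min_share}
    if not kept:  # every species fell below the threshold
        return {}
    norm = sum(kept.values())
    return {sp: s / norm for sp, s in kept.items()}


# Hypothetical per-syllable predictions for one component signal.
labels = ["A"] * 8 + ["B"] + ["C"]
method2 = weighted_vote(labels, min_share=0.2)  # only species "A" survives
method3 = weighted_vote(labels)                 # all three species are kept
```

Under these assumptions, the second method suppresses rare labels that may be misclassifications, whereas the third method retains them with a small weight.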

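Abstract (Chinese) / Abstract (English) / Acknowledgements / Table of Contents / Chapter 1 Introduction / Chapter 2 Literature Review / Chapter 3 Research Methods / Chapter 5 Summary of Experimental Data / References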

