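A Study on Note Detection and Melody Matching Method for Query By Singing/Humming System | National Taiwan University of Science and Technology Electronic Theses and Dissertations System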


Graduate student:	黃信榮 Hsin-Jung Huang
Thesis title:	A Study on Note Detection and Melody Matching Method for Query By Singing/Humming System
Advisor:	林伯慎 Bor-Shen Lin
Oral defense committee:	王新民 Hsin-Min Wang, 古鴻炎 Hung-Yan Gu
Degree:	Master
Department:	School of Management - Department of Information Management
Year of publication:	2009
Academic year of graduation:	97 (ROC calendar)
Language:	Chinese
Number of pages:	82
Keywords:	Onset Detection, Content-Based Music Information Retrieval, Melody Search, Note Detection, Query By Singing/Humming

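In a query-by-singing/humming music search system, the quality of onset detection on the sung or hummed voice determines the accuracy of note segmentation, which in turn affects search performance. Onset detection for hummed signals has long been a difficult problem, and current detection performance remains low. This thesis aims to improve onset detection for hummed signals, to use the improved method to make note segmentation more accurate, and thereby to raise the performance of query-by-singing/humming search. For onset detection, we propose moving average filtering and apply it to existing onset detection functions; experiments show that it effectively improves the performance of the original detection functions. For the onset decision, we replace the original approach of deciding on a single decision feature with multiple decision feature values, train an onset model with Gaussian mixture models, and use this model to make the onset decision. Experimental results show that this effectively improves onset detection for hummed voices. On the public singing/humming corpus (QBSH), the method reaches an F-measure of 77.4%, corresponding to a recall of 76.9% and a precision of 77.7%.
Finally, we apply this onset detection framework to a query-by-singing/humming system, combined with our melody matching system. Experimental results show that the MRR of the original system rises from 0.53 to 0.56 and the top-15 hit rate from 67% to 70%, indicating that the improved onset detection method effectively improves query-by-singing/humming search.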
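
As a rough illustration of how a moving average can be used inside an onset detection function, the sketch below compares each frame with the trailing moving average of the frames before it and reports local peaks of that rise. This is not the thesis's method: the abstract does not say how the filter is applied, so the trailing window, the half-wave rectification, the threshold, the toy envelope, and the names `moving_average` and `pick_onsets` are all assumptions made for the example.

```python
def moving_average(values, window):
    """Trailing moving average: mean of the current and up to window-1 previous samples."""
    if window < 1:
        raise ValueError("window must be >= 1")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the sample that left the window
        out.append(running / min(i + 1, window))
    return out


def pick_onsets(detection, window=3, threshold=0.5):
    """Return frame indices where the rise above the smoothed past peaks (assumed rule)."""
    smoothed = moving_average(detection, window)
    # Rise of each frame above the average of the preceding frames, clipped at zero.
    rise = [0.0] + [
        max(0.0, detection[i] - smoothed[i - 1]) for i in range(1, len(detection))
    ]
    onsets = []
    for i in range(1, len(rise) - 1):
        is_peak = rise[i] >= rise[i - 1] and rise[i] > rise[i + 1]
        if is_peak and rise[i] > threshold:
            onsets.append(i)
    return onsets


# Hypothetical energy envelope with two note attacks, at frames 3 and 9.
envelope = [0.0, 0.1, 0.1, 1.0, 0.9, 0.8, 0.3, 0.2, 0.2, 1.2, 1.0, 0.7]
onsets = pick_onsets(envelope)
print(onsets)
```

In the thesis the final decision is made by a Gaussian mixture model over several features rather than by a single fixed threshold; the threshold here only keeps the sketch short.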

Onset detection for singing voices is an important but difficult problem for note detection in query by singing/humming and in music transcription. The purpose of this thesis is to improve the performance of onset detection for singing/humming voices. It proposes an onset detection scheme that applies moving average filtering to the detection function to accentuate rising edges, and that uses a discriminative classifier based on Gaussian mixture models to combine relevant features of adjacent peaks in the final decision. Experimental results show that the scheme improves detection performance significantly, achieving a precision of 77.7% and a recall of 76.9%, for an F-measure of 77.4%.
The onset detection scheme was then combined with a query by singing/humming system, and experimental results show that using onset detection for note detection effectively improves music search performance: the MRR increases from 0.53 to 0.56 and the top-15 hit rate from 67% to 70%.
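
The two retrieval metrics quoted above, MRR and top-15 hit rate, can be computed from the rank at which the correct song appears for each query. The following is a minimal sketch under the usual definitions; the function names and the example ranks are hypothetical and are not taken from the thesis.

```python
def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over all queries; a query whose target is not retrieved scores 0.

    `ranks` holds the 1-based rank of the correct song per query, or None if absent.
    """
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


def top_n_hit_rate(ranks, n):
    """Fraction of queries whose correct song appears within the first n results."""
    return sum(1 for r in ranks if r is not None and r <= n) / len(ranks)


# Hypothetical ranks for five queries; None means the target was never returned.
ranks = [1, 2, 4, None, 20]
mrr = mean_reciprocal_rank(ranks)
top15 = top_n_hit_rate(ranks, 15)
print(round(mrr, 2), top15)
```

MRR rewards placing the correct song near the top of the list, while the top-15 hit rate only asks whether it appears in the first 15 results, so the two figures can move independently.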

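Chapter 1	Introduction
1     Research Motivation
2     Background
3     Thesis Objectives and Summary of Results
4     Thesis Organization
Chapter 2	Background Techniques
1     Pitch Tracking
2     Onset Detection
3     Onset Detection Procedure
4     Detection Functions
5     Performance Metrics for Onset Detection
6     Melody Matching Methods
7     Performance Metrics for Query by Singing/Humming Search
8     Chapter Summary
Chapter 3	Applying Moving Average Filtering to Onset Detection
1     Onset Detection Framework
2     Moving Average Filtering
3     Comparison of Detection Functions
4     Comparison of Detection Functions with Moving Average Filtering
5     Chapter Summary
Chapter 4	Using Gaussian Mixture Models for the Onset Decision
1     Threshold Test
2     Decision Features
3     Gaussian Mixture Model Classifier
4     Performance Comparison of Decision Procedures Using Gaussian Mixture Models
5     Chapter Summary
Chapter 5	Applying Onset Detection to a Query by Singing/Humming System
1     Query by Singing/Humming Search Architecture
2     Note Segmentation and Melody Matching
3     Performance Analysis of the Query by Singing/Humming System
4     Chapter Summary
Chapter 6	Conclusions and Future Work
1     Conclusions
2     Future Work
References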

                                

