| Graduate Student: | 劉名祐 (Ming-Yu Liu) |
| --- | --- |
| Thesis Title: | 利用混合式機器學習演算法之音樂情緒辨識系統 (Music Emotion Recognition System Using Hybrid Machine Learning Algorithms) |
| Advisor: | 林敬舜 (Ching-Shun Lin) |
| Committee: | 陳維美 (Wei-Mei Chen), 林昌鴻 (Chang-Hong Lin), 王煥宗 (Huan-Chun Wang) |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Electronic and Computer Engineering |
| Publication Year: | 2017 |
| Graduation Academic Year: | 105 |
| Language: | Chinese |
| Pages: | 63 |
| Keywords (Chinese): | 音樂情緒辨識、特徵擷取、泰勒情緒平面、支持向量機、深度信念網路、正規化代數乘法 |
| Keywords (English): | Music emotion recognition, Feature extraction, Thayer's arousal-valence plane, Support vector machine, Deep belief network, Normalized algebraic product |
Music emotion recognition (MER) analyzes the relationship between music and human listeners, and is useful in music understanding, music retrieval, and other related applications. As the amount of available music grows, the need to select music by emotion has also emerged. MER must take the characteristics of music psychology into account; although it has been studied for some time, no highly accurate MER system yet exists.
In this thesis, we propose a music emotion recognition system based on two music formats, each handled by a different machine learning model. The system consists of three parts: digital-signal (WAVE) MER, MIDI MER, and a decision model. The WAVE part extracts 37 features, computes a weight for each feature with the RReliefF algorithm, feeds the features into a support vector machine (SVM) in descending order of weight to classify emotion, and observes the recognition results. The MIDI part uses per-unit-time, global, and instrument features, classified by a deep belief network (DBN). Finally, we use the normalized algebraic product (NAP) to integrate the recognition results of the two classifiers.
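The weight-then-rank feature selection described above can be sketched as follows. This is an illustrative simplification: it uses the basic two-class Relief update rule (the thesis uses RReliefF, an extension of this idea), and the toy data and function names are hypothetical.

```python
import numpy as np

def relief_weights(X, y):
    """Basic Relief feature weighting (two-class sketch): a feature
    scores higher when it differs on each sample's nearest miss
    (other class) and agrees on its nearest hit (same class)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    w = np.zeros(d)
    for i in range(n):
        dists = np.abs(X - X[i]).sum(axis=1)  # L1 distance to every sample
        dists[i] = np.inf                     # exclude the sample itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dists, np.inf))
        miss = np.argmin(np.where(~same, dists, np.inf))
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = np.array([[0.0, 0.3], [0.1, 0.9], [1.0, 0.5], [0.9, 0.1]])
y = np.array([0, 0, 1, 1])
w = relief_weights(X, y)
order = np.argsort(w)[::-1]  # feed features to the SVM in this order
```

The discriminative feature receives a positive weight and the noise feature a negative one, so ranking by `order` reproduces the "add features by descending weight" selection loop.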
Music emotion recognition (MER) detects and analyzes the relation between human emotion and music clips. MER is helpful in music understanding, music retrieval, and other music-related applications. As the volume of online music content has expanded rapidly in recent years, demand for retrieval by emotion has also emerged. MER needs to take the characteristics of music psychology into consideration. Although MER has been developed for years, there is currently no well-developed emotion model for music emotion representation.
In this thesis, we propose a music emotion recognition system based on two music formats with corresponding machine learning models. More specifically, the system includes WAVE-based MER, MIDI-based MER, and a decision model. The WAVE-based MER extracts 37 features from wave files and calculates the weight of each feature with RReliefF; features are then fed to a support vector machine (SVM) for training in descending order of weight. The training data for the MIDI-based MER classifier, a deep belief network (DBN), include time-dependent, global, and instrument features. Finally, we introduce the normalized algebraic product (NAP) as the decision maker that integrates the recognition results from both classifiers.
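Read as a product-then-renormalize rule, the NAP fusion step can be sketched as below. This is an assumption about the exact normalization (the thesis may normalize differently), and the probability vectors are made-up illustration values.

```python
import numpy as np

def normalized_algebraic_product(p_wave, p_midi):
    """Fuse two per-class probability vectors by their normalized
    element-wise (algebraic) product: a class must score well under
    BOTH classifiers to survive the product."""
    product = np.asarray(p_wave, dtype=float) * np.asarray(p_midi, dtype=float)
    return product / product.sum()

# Example: four emotion classes, e.g. the quadrants of Thayer's
# arousal-valence plane (values are illustrative only).
p_wave = [0.6, 0.2, 0.1, 0.1]   # WAVE-based SVM output
p_midi = [0.5, 0.3, 0.1, 0.1]   # MIDI-based DBN output
fused = normalized_algebraic_product(p_wave, p_midi)
# The fused scores again sum to one; class 0 leads in both
# classifiers and therefore dominates the product.
```

Multiplicative fusion like this penalizes disagreement more strongly than averaging: a class that one classifier rates near zero is suppressed even if the other classifier rates it highly.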