
Graduate student: 葉政隆 (Cheng-Lung Yeh)
Thesis title: 基於音樂樣本使用卷積神經網路來進行音樂類型分類之研究 (Using Convolutional Neural Network for Classifying Music Genre Based on Samples)
Advisor: 吳怡樂 (Yi-Leh Wu)
Committee members: 何瑁鎧 (Maw-Kae Hor), 閻立剛 (Li-Kang Yen), 陳建中 (Jiann-Jone Chen), 唐政元 (Cheng-Yuan Tang), 吳怡樂 (Yi-Leh Wu)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2017
Graduation academic year: 105
Language: English
Number of pages: 48
Keywords (Chinese): 音樂類型, 深度卷積神經網路, 取樣樣本
Keywords (English): Music Genre, Convolutional Neural Network, Sample
Access counts: 309 views, 0 downloads

Abstract (Chinese, translated):
Recent research has shown that convolutional neural networks (CNNs) are highly effective for image classification, because a CNN is trained on the structural properties of its input. Some studies [1] have also tried applying CNNs to non-image data, to find out how far CNNs can go outside the image domain. This study uses a CNN for music genre classification: segments of sampled music are extracted and fed into the CNN for training, to observe what the network can learn from this raw, unprocessed information that looks disorderly to the human eye. The final experimental results show that the CNN can indeed distinguish music genres from such seemingly unstructured images.


Abstract (English):
Recent studies have shown that convolutional neural networks (CNNs) are highly effective for image classification, because CNNs exploit the structural properties of their input. A few studies [1] have applied CNNs to non-image data to explore how far the approach can be pushed outside the image domain. This study uses CNNs for music genre classification: we feed raw music samples into the network to see what it can learn from such low-level, unprocessed features. The experimental results suggest that CNNs can indeed classify images that appear unstructured to the human eye.
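As a rough illustration of the approach described above, the following sketch feeds fixed-length windows of raw PCM samples into a small one-dimensional CNN that scores the ten GTZAN genres [15]. It is not the thesis's actual model (which was built with Caffe [24]); the window length, layer shapes, and the use of PyTorch are illustrative assumptions.

```python
# Minimal sketch, assuming fixed-length windows of raw samples as CNN input.
# Hyperparameters and layer shapes are illustrative, not taken from the thesis.
import torch
import torch.nn as nn

NUM_GENRES = 10   # GTZAN defines 10 genres [15]
WINDOW = 16384    # assumed number of raw PCM samples per training example


class SampleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=64, stride=4), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=32, stride=2), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(32, 64, kernel_size=16, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),         # collapse the time axis to one value per channel
        )
        self.classifier = nn.Linear(64, NUM_GENRES)

    def forward(self, x):                    # x: (batch, 1, WINDOW), samples scaled to [-1, 1]
        h = self.features(x).squeeze(-1)     # -> (batch, 64)
        return self.classifier(h)            # unnormalized genre scores


if __name__ == "__main__":
    model = SampleCNN()
    windows = torch.randn(8, 1, WINDOW)          # stand-in for a batch of sample windows
    labels = torch.randint(0, NUM_GENRES, (8,))  # stand-in genre labels
    loss = nn.CrossEntropyLoss()(model(windows), labels)
    loss.backward()                              # one illustrative training step
```

In practice each 30-second GTZAN clip would be cut into many such windows; the outline below lists a "Voting System" (Section 4.5), presumably aggregating the per-window predictions into a clip-level decision.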

Contents:
Abstract (Chinese)
Abstract (English)
Contents
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Deep Learning Model
Chapter 3. Proposed Methods
  3.1 Music Sample
  3.2 Music Waveform
  3.3 Music Spectrum
Chapter 4. Experiment Result
  4.1 Dataset and Environment
  4.2 Sample Experiments
  4.3 Waveform Experiments
  4.4 Spectrum Experiments
  4.5 Voting System
Chapter 5. Conclusions and Future Work
References

[1] H. Orii, S. Tsuji, T. Kouda, "Tactile texture recognition using convolutional neural networks for time-series data of pressure and 6-axis acceleration sensor", IEEE International Conference on Industrial Technology (ICIT), 2017.
[2] J. Salamon, B. Rocha, E. Gómez, "Musical genre classification using melody features extracted from polyphonic music signals", IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), 2012.
[3] C. H. Lee, C. H. Chou, C. C. Lien, J. C. Fang, “Music genre classification using modulation spectral features and multiple prototype vectors representation”, IEEE 4th International Congress on Image and Signal Processing (CISP), 2011.
[4] D. Pradeep Kumar, B. J. Sowmya, K. G. Srinivasa, “A comparative study of classifiers for music genre classification based on feature extractors”, IEEE Distributed Computing, VLSI, Electrical Circuits and Robotics (DISCOVER), 2016.
[5] K. Hazim, S. Tomas, "Multimodal Genre Classification of TV programs and YouTube Videos", Multimedia Tools and Applications, vol. 63, no. 2, pp. 547-567, 2013.
[6] P. Ahrendt, A. Meng, J. Larsen, “Decision time horizon for music genre classification using short time features”, 12th European Signal Processing Conference, 2004.
[7] J. M. de Sousa, E. T. Pereira, L. R. Veloso, “A robust music genre classification approach for global and regional music datasets evaluation”, IEEE International Conference on Digital Signal Processing (DSP), 2016.
[8] A. B. Chan, A. H. Chun, “Automatic Musical Pattern Feature Extraction Using Convolutional Neural Network”, Proceedings of the International MultiConference of Engineers and Computer Scientists (IMECS), 2010.
[9] G. Tzanetakis, P. Cook, “Musical genre classification of audio signals”, IEEE Transactions on speech and audio processing, pp. 293–302, 2002.
[10] M. Kobayakawa, M. Hoshi, “Musical genre classification of MPEG-4 TwinVQ audio data”, IEEE International Conference on Multimedia and Expo (ICME), 2011.
[11] K. C. Hsu, C. S. Lin, T. S. Chi, “Sparse Coding Based Music Genre Classification Using Spectro-Temporal Modulations”, Proceedings of the 17th ISMIR Conference, 2016.
[12] T. Nakashika, C. Garcia, T. Takiguchi, “Local-feature-map Integration Using Convolutional Neural Networks for Music Genre Classification”, Interspeech ISCA's 13th Annual Conference, 2012.
[13] W. Zhang, W. Lei, X. Xu, X. Xing, "Improved Music Genre Classification with Convolutional Neural Networks", Interspeech, San Francisco, 2016.
[14] B. Hua, F. L. Ma, L. C. Jiao, “Research on Computation of GLCM of Image Texture”, Chinese Journal of Electronics, 2006.
[15] GTZAN dataset, http://marsyasweb.appspot.com/download/data_sets/, Referenced on May 18th, 2017.
[16] J. Dai, W. Liu, H. Zheng, W. Xue, C. Ni, “Semi-supervised Learning of Bottleneck Feature for Music Genre Classification”, Chinese Conference on Pattern Recognition (CCPR), pp. 552-562, 2016.
[17] Support vector machine, https://en.wikipedia.org/wiki/Support_vector_machine, Referenced on May 20th, 2017.
[18] GPU development in recent years, http://bkultrasound.com/blog/the-next-generation-of-ultrasound-technology, Referenced on May 20th, 2017.
[19] Typical convolutional neural network architecture, https://en.wikipedia.org/wiki/Convolutional_neural_network, Referenced on May 20th, 2017.
[20] A. Krizhevsky, I. Sutskever, G. E. Hinton, "ImageNet classification with deep convolutional neural networks", Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
[21] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, A. C. Berg, "ImageNet large scale visual recognition challenge", International Journal of Computer Vision, vol. 115, no. 3, pp. 211-252, 2015.
[22] WAVE PCM soundfile format, http://soundfile.sapp.org/doc/WaveFormat/, Referenced on May 20th, 2017.
[23] Fast Fourier Transform (FFT), https://read01.com/7DA3N4.html, Referenced on May 22nd, 2017.
[24] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. B. Girshick, S. Guadarrama, T. Darrell, “Caffe: Convolutional architecture for fast feature embedding”, In Proceedings of the ACM International Conference on Multimedia, pp. 675-678, 2014.

Full text release date: 2022/07/18 (campus network)
Full text not authorized for public access (off-campus network)
Full text not authorized for public access (National Central Library: Taiwan theses and dissertations system)