Graduate Student: 廖奕智 Yi-Jr Liao
Thesis Title: 一種音樂播放演算法基於 Residual-Inception Blocks 之音樂情緒分類及生理訊號 (A Music Playback Algorithm Based on Residual-Inception Blocks for Music Emotion Classification and Physiological Information)
Advisor: 阮聖彰 Shanq-Jang Ruan
Committee Members: 王維君 Wei-Chun Wang, 李育豪 Yu-Hao Lee
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electronic and Computer Engineering
Year of Publication: 2022
Academic Year: 110
Language: English
Pages: 70
Keywords (Chinese): 卷積神經網絡, 情感分類, 深度學習, 音樂選擇模塊, 生理數據
Keywords (English): convolutional neural networks, emotion classification, deep learning, music selection module, physiological data
Abstract (Chinese, translated): In recent years, many researchers have demonstrated that music can improve exercise performance. However, studies on the practical implementation of music intervention during exercise are scarce in the literature. Therefore, this thesis designs a playback sequence system for joggers by considering music emotion and physiological signals. To allow the system to operate over long periods, this thesis improves the model and the music selection module to reduce energy consumption. The proposed model obtains fewer FLOPs and parameters by using the logarithm-scaled Mel-spectrogram as the input feature. We evaluated the model's accuracy, computational complexity, trainable parameters, and inference time on the 4Q emotion and Soundtrack datasets. The experimental results show that the proposed model outperforms other models on both datasets. More specifically, compared with other models, the proposed model reduces computational complexity and inference time while maintaining classification accuracy. Moreover, the proposed model's small size for network training allows it to be deployed on mobile phones and other devices with limited computing resources. By considering the relationship between music emotion and physiological conditions during exercise, this study designs an overall playback sequence system that can be adopted directly during exercise to improve users' exercise efficiency.
Abstract (English): Music can generate a positive effect on runners' performance and motivation. However, the practical implementation of music intervention during exercise is mostly absent from the literature. Therefore, this paper designs a playback sequence system for joggers by considering music emotion and physiological signals. This playback sequence is implemented by a music selection module that combines artificial intelligence techniques with physiological data and music emotion. In order to make the system operate for a long time, this paper improves the model and the music selection module to achieve lower energy consumption. The proposed model obtains fewer FLOPs and parameters by using the logarithm-scaled Mel-spectrogram as the input feature. The accuracy, computational complexity, trainable parameters, and inference time are evaluated on the Bi-modal, 4Q emotion, and Soundtrack datasets. The experimental results show that the proposed model outperforms the model of Sarkar et al. and achieves competitive performance on the Bi-modal (84.91%), 4Q emotion (92.04%), and Soundtrack (87.24%) datasets. More specifically, the proposed model reduces the computational complexity and inference time while maintaining the classification accuracy, compared to other models. Moreover, the size of the proposed model for network training is small, so it can be applied to mobile phones and other devices with limited computing resources. This study designed the overall playback sequence system by considering the relationship between music emotion and physiological conditions during exercise. The playback sequence system can be adopted directly during exercise to improve users' exercise efficiency.
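The logarithm-scaled Mel-spectrogram input feature described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the thesis implementation; the HTK mel formula, frame size, hop length, and 64 mel bands are assumptions chosen for illustration (the bibliography cites librosa [57], which provides this feature directly via librosa.feature.melspectrogram).

```python
import numpy as np

def hz_to_mel(f):
    # HTK mel scale: mel = 2595 * log10(1 + f / 700)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale, shape (n_mels, n_fft//2 + 1).
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr=22050, n_fft=1024, hop=512, n_mels=64):
    # Frame the signal, apply a Hann window, and take the power spectrum per frame.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2      # (frames, freq bins)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T      # (frames, mel bands)
    # Logarithm scaling compresses the dynamic range, as the abstract describes.
    return np.log(mel + 1e-10)
```

The resulting 2-D feature map (time frames by mel bands) is what a CNN-style classifier such as the proposed Residual-Inception model would consume as its input image.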
References:
[1] Lane, A.M.; Lovejoy, D.J. The effects of exercise on mood changes: The moderating effect of depressed mood. J. Sports Med. Phys. Fit. 2001, 41, 539–545.
[2] Warburton, D.E.; Nicol, C.W.; Bredin, S.S. Health benefits of physical activity: The evidence. CMAJ Can. Med. Assoc. J. 2006, 174, 801–809.
[3] Pedersen, B.K.; Fischer, C.P. Beneficial health effects of exercise–the role of IL-6 as a myokine. Trends Pharmacol. Sci. 2007, 28, 152–156.
[4] Geirsdottir, O.G.; Arnarson, A.; Briem, K.; Ramel, A.; Tomasson, K.; Jonsson, P.; Thorsdottir, I. Physical function predicts improvement in quality of life in elderly Icelanders after 12 weeks of resistance exercise. J. Nutr. Health Aging 2012, 16, 62–66.
[5] Bernardi, L.; Porta, C.; Sleight, P. Cardiovascular, cerebrovascular, and respiratory changes induced by different types of music in musicians and non-musicians: The importance of silence. Heart 2006, 92, 445–452.
[6] Trappe, H.J. The effects of music on the cardiovascular system and cardiovascular health. Heart 2010, 96, 1868–1871.
[7] Bason, P.; Celler, B. Control of the heart rate by external stimuli. Nature 1972, 238, 279–280.
[8] Karageorghis, C.I.; Terry, P.C.; Lane, A.M.; Bishop, D.T.; Priest, D.L. The BASES Expert Statement on use of music in exercise. J. Sports Sci. 2012, 30, 953–956.
[9] Johnson, J.E. The use of music to promote sleep in older women. J. Community Health Nurs. 2003, 20, 27–35.
[10] Cooke, M.; Chaboyer, W.; Schluter, P.; Hiratos, M. The effect of music on preoperative anxiety in day surgery. J. Adv. Nurs. 2005, 52, 47–55.
[11] Karow, M.C.; Rogers, R.R.; Pederson, J.A.; Williams, T.D.; Marshall, M.R.; Ballmann, C.G. Effects of preferred and nonpreferred warm-up music on exercise performance. Percept. Mot. Skills 2020, 127, 912–924.
[12] Wijnalda, G.; Pauws, S.; Vignoli, F.; Stuckenschmidt, H. A personalized music system for motivation in sport performance. IEEE Pervasive Comput. 2005, 4, 26–32.
[13] Van Dyck, E.; Moens, B.; Buhmann, J.; Demey, M.; Coorevits, E.; Dalla Bella, S.; Leman, M. Spontaneous entrainment of running cadence to music tempo. Sports Med.-Open 2015, 1, 15.
[14] Gallego, M.G.; García, J.G. Music therapy and Alzheimer’s disease: Cognitive, psychological, and behavioural effects. Neurología 2017, 32, 300–308.
[15] Cheng, J.C.; Chiu, C.Y.; Su, T.J. Training and evaluation of human cardiorespiratory endurance based on a fuzzy algorithm. Int. J. Environ. Res. Public Health 2019, 16, 2390.
[16] Pao, T.L.; Chen, Y.T.; Yeh, J.H.; Cheng, Y.M.; Lin, Y.Y. A comparative study of different weighting schemes on KNN-based emotion recognition in Mandarin speech. In International Conference on Intelligent Computing; Springer: Berlin/Heidelberg, Germany, 2007; pp. 997–1005.
[17] Yadav, A.; Vishwakarma, D.K. A multilingual framework of CNN and bi-LSTM for emotion classification. In Proceedings of the 2020 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT), Kharagpur, India, 1–3 July 2020; pp. 1–6.
[18] Szmedra, L.; Bacharach, D. Effect of music on perceived exertion, plasma lactate, norepinephrine and cardiovascular hemodynamics during treadmill running. Int. J. Sports Med. 1998, 19, 32–37.
[19] Atan, T. Effect of music on anaerobic exercise performance. Biol. Sport 2013, 30, 35.
[20] Karageorghis, C.I.; Priest, D.L. Music in the exercise domain: A review and synthesis (Part I). Int. Rev. Sport Exerc. Psychol. 2012, 5, 44–66.
[21] Karageorghis, C.I. Applying Music in Exercise and Sport; Human Kinetics: Champaign, IL, USA, 2016.
[22] Atkinson, G.; Wilson, D.; Eubank, M. Effects of music on work-rate distribution during a cycling time trial. Int. J. Sports Med. 2004, 25, 611–615.
[23] Edworthy, J.; Waring, H. The effects of music tempo and loudness level on treadmill exercise. Ergonomics 2006, 49, 1597–1610.
[24] Yamashita, S.; Iwai, K.; Akimoto, T.; Sugawara, J.; Kono, I. Effects of music during exercise on RPE, heart rate and the autonomic nervous system. J. Sports Med. Phys. Fit. 2006, 46, 425.
[25] Schücker, L.; Hagemann, N.; Strauss, B.; Völker, K. The effect of attentional focus on running economy. J. Sports Sci. 2009, 27, 1241–1248.
[26] Carmichael, K.E.; Marshall, D.N.; Roche, B.M.; Olson, R.L. Effects of music on arousal, affect, and mood following moderate-intensity cycling. Int. J. Exerc. Sci. Conf. Proc. 2018, 2, 91.
[27] Nikol, L.; Kuan, G.; Ong, M.; Chang, Y.K.; Terry, P.C. The heat is on: Effects of synchronous music on psychophysiological parameters and running performance in hot and humid conditions. Front. Psychol. 2018, 9, 1114.
[28] Waterhouse, J.; Hudson, P.; Edwards, B. Effects of music tempo upon submaximal cycling performance. Scand. J. Med. Sci. Sports 2010, 20, 662–669.
[29] Terry, P.C.; Karageorghis, C.I. Music in Sport and Exercise. 2011. Available online: https://eprints.usq.edu.au/19163/ (accessed on 13 April 2020).
[30] Moss, S.L.; Enright, K.; Cushman, S. The influence of music genre on explosive power, repetitions to failure and mood responses during resistance exercise. Psychol. Sport Exerc. 2018, 37, 128–138.
[31] Borg, G.A. Psychophysical bases of perceived exertion. Med. Sci. Sports Exerc. 1982, 14, 377–381.
[32] Maddigan, M.E.; Sullivan, K.M.; Halperin, I.; Basset, F.A.; Behm, D.G. High tempo music prolongs high intensity exercise. PeerJ 2019, 6, e6164.
[33] Liu, X.; Chen, Q.; Wu, X.; Liu, Y.; Liu, Y. CNN based music emotion classification. arXiv 2017, arXiv:1704.05665.
[34] Er, M.B.; Aydilek, I.B. Music emotion recognition by using chroma spectrogram and deep visual features. Int. J. Comput. Intel. Syst. 2019, 12, 1622–1634.
[35] Hizlisoy, S.; Yildirim, S.; Tufekci, Z. Music emotion recognition using convolutional long short term memory deep neural networks. Eng. Sci. Technol. Int. J. 2021, 24, 760–767.
[36] Russell, J.A. A circumplex model of affect. J. Personal. Soc. Psychol. 1980, 39, 1161.
[37] Kuppens, P.; Tuerlinckx, F.; Yik, M.; Koval, P.; Coosemans, J.; Zeng, K.J.; Russell, J.A. The relation between valence and arousal in subjective experience varies with personality and culture. J. Personal. 2017, 85, 530–542.
[38] Uzkent, B.; Barkana, B.D.; Cevikalp, H. Non-speech environmental sound classification using SVMs with a new set of features. Int. J. Innov. Comput. Inf. Control 2012, 8, 3511–3524.
[39] Stowell, D.; Giannoulis, D.; Benetos, E.; Lagrange, M.; Plumbley, M.D. Detection and classification of acoustic scenes and events. IEEE Trans. Multimed. 2015, 17, 1733–1746.
[40] Mashhadi, Z.; Saadati, H.; Dadkhah, M. Investigating the Putative Mechanisms Mediating the Beneficial Effects of Exercise on the Brain and Cognitive Function. Int. J. Med Rev. 2021, 8, 45–56.
[41] Lee, H.; Pham, P.; Largman, Y.; Ng, A. Unsupervised feature learning for audio classification using convolutional deep belief networks. Adv. Neural Inf. Process. Syst. 2009, 22, 1096–1104.
[42] Howard, A.G.; Zhu, M.; Chen, B.; Kalenichenko, D.; Wang, W.; Weyand, T.; Andreetto, M.; Adam, H. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861.
[43] Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A.A. Inception-v4, inception-resnet and the impact of residual connections on learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
[44] Panda, R.; Malheiro, R.; Paiva, R.P. Musical texture and expressivity features for music emotion recognition. In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR 2018), Paris, France, 23–27 September 2018; pp. 383–391.
[45] Eerola, T.; Vuoskoski, J.K. A comparison of the discrete and dimensional models of emotion in music. Psychol. Music 2011, 39, 18–49.
[46] Elliott, G.T.; Tomlinson, B. PersonalSoundtrack: Context-aware playlists that adapt to user pace. In Proceedings of the CHI’06 Extended Abstracts on Human Factors in Computing Systems, Montréal, QC, Canada, 22–27 April 2006; pp. 736–741.
[47] De Oliveira, R.; Oliver, N. TripleBeat: Enhancing exercise performance with persuasion. In Proceedings of the 10th International Conference on Human Computer Interaction with Mobile Devices and Services, Amsterdam, The Netherlands, 2–5 September 2008; pp. 255–264.
[48] Khushhal, A.; Nichols, S.; Evans, W.; Gleadall-Siddall, D.O.; Page, R.; O’Doherty, A.F.; Carroll, S.; Ingle, L.; Abt, G. Validity and reliability of the Apple Watch for measuring heart rate during exercise. Sports Med. Int. Open 2017, 1, E206–E211.
[49] Chiu, M.C.; Ko, L.W. Develop a personalized intelligent music selection system based on heart rate variability and machine learning. Multimed. Tools Appl. 2017, 76, 15607–15639.
[50] Malik, M. Standard measurement of heart rate variability. In Dynamic Electrocardiography; Wiley: New York, NY, USA, 2008; pp. 13–21.
[51] Medicore. SA-3000P Clinical Manual, Version 3.0; 2015. Retrieved 8 June 2015. Available online: https://therisingsea.org/notes/FoundationsForCategoryTheory.pdf (accessed on 17 July 2020).
[52] He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
[53] Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, Lille, France, 7–9 July 2015; pp. 448–456.
[54] Sarkar, R.; Choudhury, S.; Dutta, S.; Roy, A.; Saha, S.K. Recognition of emotion in music based on deep convolutional neural network. Multimed. Tools Appl. 2020, 79, 765–783.
[55] Lin, M.; Chen, Q.; Yan, S. Network in network. arXiv 2013, arXiv:1312.4400.
[56] Han, Y.; Lee, K. Convolutional neural network with multiple-width frequency-delta data augmentation for acoustic scene classification. In Proceedings of the IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events, Budapest, Hungary, 3 September 2016.
[57] McFee, B.; Raffel, C.; Liang, D.; Ellis, D.P.; McVicar, M.; Battenberg, E.; Nieto, O. librosa: Audio and music signal analysis in python. In Proceedings of the 14th Python in Science Conference, Austin, TX, USA, 6–12 July 2015; Volume 8, pp. 18–25.
[58] Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
[59] Chaudhary, D.; Singh, N.P.; Singh, S. Development of music emotion classification system using convolution neural network. Int. J. Speech Technol. 2021, 24, 571–580.
[60] Saari, P.; Eerola, T.; Lartillot, O. Generalizability and simplicity as criteria in feature selection: Application to mood classification in music. IEEE Trans. Audio Speech Lang. Process. 2010, 19, 1802–1812.
[61] Chen, N.; Wang, S. High-Level Music Descriptor Extraction Algorithm Based on Combination of Multi-Channel CNNs and LSTM. In Proceedings of the ISMIR, Suzhou, China, 23–27 October 2017; pp. 509–514.