
Graduate Student: Po-Jung Ting (丁柏容)
Thesis Title: An Environmental Noise Classification Model Based on Inception-Dense Blocks for Hearing Aids
Advisor: Shanq-Jang Ruan (阮聖彰)
Committee Members: Shanq-Jang Ruan (阮聖彰), Jenq-Shiou Leu (呂政修), Lieber Po-Hung Li (力博宏)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2021
Academic Year of Graduation: 109 (2020-2021)
Language: English
Pages: 71
Keywords: Hearing aids, environmental noise classification, deep learning, convolutional neural networks
Views: 285; Downloads: 0


    Hearing aids have become essential for people with hearing loss, and environmental noise estimation and classification are among the technologies these devices require. However, some noise classifiers use multiple audio features as input, which makes them computationally expensive. In addition, noise classifiers that accept inputs of different time lengths may differ in classification performance. This thesis therefore proposes a deep learning model for environmental noise classification and evaluates it on audio segments of three different lengths. The proposed model requires fewer floating-point operations and parameters because it uses only the log-scaled mel-spectrogram as its input feature. The model is evaluated for classification accuracy, computational complexity, trainable parameters, and inference time on the UrbanSound8K and HANS datasets. The experimental results show that the proposed model outperforms other models on both datasets, reducing model complexity and inference time while maintaining classification accuracy. The proposed noise classifier for hearing aids therefore lowers computational complexity without compromising performance.
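    The model's only input feature is the log-scaled mel-spectrogram. As a rough illustration of how such a feature can be computed, the following NumPy sketch frames the signal, applies a mel filterbank, and log-scales the result. The FFT size, hop length, and number of mel bands below are illustrative assumptions, not the settings used in the thesis; librosa [40] provides an equivalent off-the-shelf implementation.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale from 0 Hz to sr/2.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)  # rising edge
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)  # falling edge
    return fb

def log_mel_spectrogram(y, sr, n_fft=1024, hop=512, n_mels=64):
    # Frame the signal, window each frame, and take the power spectrum.
    n_frames = 1 + (len(y) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([y[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Project onto mel bands, then log-scale (in dB, with a small floor).
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return 10.0 * np.log10(mel + 1e-10)

# Example: one second of a 440 Hz tone at a 22,050 Hz sampling rate.
sr = 22050
t = np.arange(sr) / sr
feature = log_mel_spectrogram(np.sin(2 * np.pi * 440.0 * t), sr)
print(feature.shape)  # (frames, mel bands)
```

    The resulting 2-D array is what a convolutional classifier consumes as an image-like input.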

    Chinese Abstract
    Abstract
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    1.1 The Necessity of Environmental Noise Classification for Hearing Aids
    1.2 Flow of the Environmental Sound Classification System
    1.3 Audio Signal Feature Extraction and Classifier
    1.4 The Challenges of Environmental Noise Classification for Hearing Aids
    1.5 Features of This Thesis
    1.6 Organization of This Thesis
    Chapter 2 Related Works
    2.1 Time-Frequency Representations for Noise Signals
    2.2 Conventional Noise Classification Algorithms
    2.3 Deep Convolutional Neural Network
    Chapter 3 Proposed Method
    3.1 Inception Block with Dense Connectivity
    3.2 Depthwise Separable Convolution
    3.3 Network Structure
    Chapter 4 Experiment Settings
    4.1 Dataset
    4.1.1 UrbanSound8K
    4.1.2 Hearing Aids Noisy Sound (HANS)
    4.2 Data Pre-processing
    4.3 Data Augmentation
    4.3.1 Raw Data
    4.3.2 Spectrogram
    4.4 Training Settings
    Chapter 5 Experimental Results
    5.1 Classification Results on UrbanSound8K
    5.2 Classification Results on the Hearing Aids Noisy Sound (HANS)
    Chapter 6 Conclusions
    References
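    Section 3.2 of the outline covers depthwise separable convolution, the factorization behind the reduced parameter count and floating-point operations claimed in the abstract (popularized by MobileNets [37]). A small arithmetic sketch of the saving; the kernel and channel sizes here are illustrative assumptions, not the thesis's actual layer shapes, and biases are ignored:

```python
def standard_conv_params(k, c_in, c_out):
    # One k x k kernel spanning all input channels, per output channel.
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # One k x k depthwise filter per input channel,
    # followed by a 1x1 pointwise convolution mixing channels.
    return k * k * c_in + c_in * c_out

# Illustrative layer: 3x3 kernel, 64 input channels, 128 output channels.
k, c_in, c_out = 3, 64, 128
std = standard_conv_params(k, c_in, c_out)        # 73,728 weights
sep = depthwise_separable_params(k, c_in, c_out)  # 8,768 weights
print(std, sep, round(std / sep, 1))
```

    For this layer the factorization cuts the weight count by roughly 8x; the FLOP count of the convolution shrinks by about the same factor, since each weight is applied once per output position.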

    [1] J. Löhler, L.E. Walther, F. Hansen, P. Kapp, J. Meerpohl, B. Wollenberg, R. Schönweiler, and C. Schmucker, "The prevalence of hearing loss and use of hearing aids among adults in Germany: a systematic review," European Archives of Oto-Rhino-Laryngology, Vol. 276, 2019, pp. 945-956.
    [2] World Health Organization, "Deafness and hearing loss," 2020. Available from: https://www.who.int/news-room/fact-sheets/detail/deafness-and-hearing-loss.
    [3] C.D. Mulrow, M.R. Tuley, and C. Aguilar, "Sustained benefits of hearing aids," Journal of Speech, Language, and Hearing Research, Vol. 35, 1992, pp. 1402-1405.
    [4] L. Vestergaard Knudsen, M. Öberg, C. Nielsen, G. Naylor, and S.E. Kramer, "Factors influencing help seeking, hearing aid uptake, hearing aid use and satisfaction with hearing aids: A review of the literature," Trends in Amplification, Vol. 14, 2010, pp. 127-154.
    [5] A. Skagerstrand, S. Stenfelt, S. Arlinger, and J. Wikström, "Sounds perceived as annoying by hearing-aid users in their daily soundscape," International Journal of Audiology, Vol. 53, 2014, pp. 259-269.
    [6] W. Shi, X. Zhang, X. Zou, and W. Han, "Deep neural network and noise classification-based speech enhancement," Modern Physics Letters B, Vol. 31, 2017, p. 1740096.
    [7] G. Park, W. Cho, K.-S. Kim, and S. Lee, "Speech enhancement for hearing aids with deep learning on environmental noises," Applied Sciences, Vol. 10, 2020, p. 6077.
    [8] L. Zhang, M. Wang, Q. Zhang, and M. Liu, "Environmental attention-guided branchy neural network for speech enhancement," Applied Sciences, Vol. 10, 2020, p. 1167.
    [9] K. Lee and D.P. Ellis, "Audio-based semantic concept classification for consumer video," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, 2009, pp. 1406-1416.
    [10] A. Temko, E. Monte, and C. Nadeu, "Comparison of sequence discriminant support vector machines for acoustic event classification," in 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2006.
    [11] S. Chu, S. Narayanan, and C.-C.J. Kuo, "Environmental sound recognition with time-frequency audio features," IEEE Transactions on Audio, Speech, and Language Processing, Vol. 17, 2009, pp. 1142-1158.
    [12] J.T. Geiger and K. Helwani, "Improving event detection for audio surveillance using Gabor filterbank features," in 2015 23rd European Signal Processing Conference (EUSIPCO), 2015.
    [13] X. Valero and F. Alias, "Gammatone cepstral coefficients: Biologically inspired features for non-speech audio classification," IEEE Transactions on Multimedia, Vol. 14, 2012, pp. 1684-1689.
    [14] B. Uzkent, B.D. Barkana, and H. Cevikalp, "Non-speech environmental sound classification using SVMs with a new set of features," International Journal of Innovative Computing, Information and Control, Vol. 8, 2012, pp. 3511-3524.
    [15] D. Stowell, D. Giannoulis, E. Benetos, M. Lagrange, and M.D. Plumbley, "Detection and classification of acoustic scenes and events," IEEE Transactions on Multimedia, Vol. 17, 2015, pp. 1733-1746.
    [16] K.J. Piczak, "ESC: Dataset for environmental sound classification," in Proceedings of the 23rd ACM International Conference on Multimedia, 2015.
    [17] Z. Kons, O. Toledo-Ronen, and M. Carmel, "Audio event classification using deep neural networks," in Interspeech, 2013.
    [18] I. McLoughlin, H. Zhang, Z. Xie, Y. Song, and W. Xiao, "Robust sound event classification using deep neural networks," IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 23, 2015, pp. 540-552.
    [19] J. Salamon and J.P. Bello, "Deep convolutional neural networks and data augmentation for environmental sound classification," IEEE Signal Processing Letters, Vol. 24, 2017, pp. 279-283.
    [20] K. Palanisamy, D. Singhania, and A. Yao, "Rethinking CNN models for audio classification," arXiv preprint arXiv:2007.11154, 2020.
    [21] A. Sehgal and N. Kehtarnavaz, "Guidelines and benchmarks for deployment of deep learning models on smartphones as real-time apps," Machine Learning and Knowledge Extraction, Vol. 1, 2019, pp. 450-465.
    [22] M. Huzaifah, "Comparison of time-frequency representations for environmental sound classification using convolutional neural networks," arXiv preprint arXiv:1706.07156, 2017.
    [23] Y. Su, K. Zhang, J. Wang, D. Zhou, and K. Madani, "Performance analysis of multiple aggregated acoustic features for environment sound classification," Applied Acoustics, Vol. 158, 2020, p. 107050.
    [24] S.S. Stevens, J. Volkmann, and E.B. Newman, "A scale for the measurement of the psychological magnitude pitch," The Journal of the Acoustical Society of America, Vol. 8, 1937, pp. 185-190.
    [25] M.T. Hagan, H.B. Demuth, and M. Beale, Neural Network Design. PWS Publishing Co., 1997.
    [26] P. Nordqvist and A. Leijon, "An efficient robust sound classification algorithm for hearing aids," The Journal of the Acoustical Society of America, Vol. 115, 2004, pp. 3033-3041.
    [27] M. Büchler, S. Allegro, S. Launer, and N. Dillier, "Sound classification in hearing aids inspired by auditory scene analysis," EURASIP Journal on Advances in Signal Processing, 2005, pp. 1-12.
    [28] K. Abe, H. Sakaue, T. Okuno, and K. Terada, "Sound classification for hearing aids using time-frequency images," in Proceedings of 2011 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, 2011.
    [29] K.J. Piczak, "Environmental sound classification with convolutional neural networks," in 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP), 2015.
    [30] Z. Zhang, S. Xu, S. Cao, and S. Zhang, "Deep convolutional neural network with mixup for environmental sound classification," in Chinese Conference on Pattern Recognition and Computer Vision (PRCV), 2018.
    [31] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
    [32] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
    [33] G. Huang, Z. Liu, L. van der Maaten, and K.Q. Weinberger, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
    [34] J. Singh and R. Joshi, "Background sound classification in speech audio segments," in 2019 International Conference on Speech Technology and Human-Computer Dialogue (SpeD), 2019.
    [35] G. Park and S. Lee, "Environmental noise classification using convolutional neural networks with input transform for hearing aids," International Journal of Environmental Research and Public Health, Vol. 17, 2020, p. 2270.
    [36] W. Roedily, S.-J. Ruan, and L.P.-H. Li, "Real-time noise classifier on smartphones," IEEE Consumer Electronics Magazine, Vol. 10, 2020, pp. 37-42.
    [37] A.G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, "MobileNets: Efficient convolutional neural networks for mobile vision applications," arXiv preprint arXiv:1704.04861, 2017.
    [38] M. Lin, Q. Chen, and S. Yan, "Network in network," arXiv preprint arXiv:1312.4400, 2013.
    [39] J. Salamon, C. Jacoby, and J.P. Bello, "A dataset and taxonomy for urban sound research," in Proceedings of the 22nd ACM International Conference on Multimedia, 2014.
    [40] B. McFee, C. Raffel, D. Liang, D.P. Ellis, M. McVicar, E. Battenberg, and O. Nieto, "librosa: Audio and music signal analysis in Python," in Proceedings of the 14th Python in Science Conference, 2015.
    [41] B. McFee, E.J. Humphrey, and J.P. Bello, "A software framework for musical data augmentation," in ISMIR, 2015.
    [42] H. Zhang, M. Cisse, Y.N. Dauphin, and D. Lopez-Paz, "mixup: Beyond empirical risk minimization," arXiv preprint arXiv:1710.09412, 2017.
    [43] F. Chollet et al., "Keras," 2015. Available from: https://keras.io.

    Full text available from 2026/08/27 (campus network)
    Full text available from 2026/08/27 (off-campus network)
    Full text available from 2026/08/27 (National Central Library: Taiwan NDLTD system)