
Author: Rodiatul Adawiya Abdul Rahman
Thesis Title: Mobile Application for Real-Time Bird Sound Recognition using Convolutional Neural Network
Advisor: Yang, Chuan-Kai
Committee: Yuan-Cheng Lai, Bor-Shen Lin
Degree: Master's
Department: School of Management - Department of Information Management
Thesis Publication Year: 2020
Graduation Academic Year: 108
Language: English
Pages: 53
Keywords: Audio Features, Bioacoustics, Bird Sound Recognition, Convolutional Neural Networks, Mobile-based Application

Master's Thesis Recommendation Form
Qualification Form by Master's Degree Examination Committee
Abstract
Acknowledgment
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
  1.1 Background
  1.2 Contribution
  1.3 Research Outline
Chapter 2. Related Works
  2.1 Animal Sound Recognition
  2.2 Bird Sound Recognition
    2.2.1 Bird Sound Recognition using Traditional Approach
    2.2.2 Bird Sound Recognition using CNN
  2.3 Convolutional Neural Networks (CNN)
Chapter 3. Proposed System
  3.1 System Overview
  3.2 System Architecture
  3.3 Generating Spectrograms
  3.4 Dataset
  3.5 Dataset Augmentation
  3.6 Training the CNN Model
  3.7 Recognizing the Bird Sound
Chapter 4. Experimental Results
  4.1 Experiments
  4.2 Comparison Results
Chapter 5. Conclusion and Discussion
  5.1 Conclusion
  5.2 Limitation and Future Work
References

[1] Potamitis, I. “Unsupervised dictionary extraction of bird vocalisations and new tools on assessing and visualising bird activity”. Ecological Informatics 26, Part 3, 6–17, 2015.
[2] Stastny, J., Munk, M., Juranek, L. “Automatic bird species recognition based on bird’s vocalization”. EURASIP Journal on Audio, Speech, and Music Processing, 2018.
[3] Dong, X., Towsey, M., Truskinger, A., Cottman-Fields, M., Zhang, J., Roe, P. “Similarity-based birdcall retrieval from environmental audio”. Ecological Informatics 29, Part 1, 66–76, 2015.
[4] Albornoz, E. M., Vignolo, L. D., Sarquis, J. A., Leon, E. “Automatic classification of Furnariidae species from the Paranaense Littoral region using speech-related features and machine learning”. Ecological Informatics 38, 39–49, 2017.
[5] Sprengel, E., Jaggi, M., Kilcher, Y., Hofmann, T. “Audio based bird species identification using deep learning techniques”. Working Notes of CLEF, 2016.
[6] Kontas, M. “Sound-Based Bird Classification: How a group of Polish women used deep learning, acoustics and ornithology to classify birds”. Towards Data Science, April 27, 2020.
[7] Kahl, S., et al. “Large-Scale Bird Sound Classification using Convolutional Neural Networks”. Working Notes of CLEF, 2017.
[8] Yeo, C. Y., Al-Haddad, S. A. R., Ng, C. K. “Animal Voice Recognition for Identification (ID) Detection System”. IEEE 7th International Colloquium on Signal Processing and its Applications, 2011.
[9] Moscow Zoo. “Gallery of animals’ sounds”. Accessed on May 28, 2020.
[10] Bang, A. V., Rege, P. P. “Recognition of Bird Species from their Sounds using Data Reduction Techniques”. ICCCT-2017: Proceedings of the 7th International Conference on Computer and Communication Technology, 2017.
[11] Kaminska, D., Gmerek, A. “Automatic identification of bird species: A comparison between KNN and SOM classifiers”. IEEE Joint Conference New Trends in Audio & Video and Signal Processing: Algorithms, Architectures, Arrangements and Applications, 2012.
[12] Cai, J., Ee, D., Pham, B., Roe, P., Zhang, J. “Sensor network for the monitoring of ecosystem: Bird species recognition”. IEEE 3rd International Conference on Intelligent Sensors, Sensor Networks and Information (ISSNIP 2007), pp. 293–298, 2007.
[13] Kun, Q., Zixing, Z., Ringeval, F., Schuller, B. “Bird Sounds Classification by Large Scale Acoustic Features and Extreme Learning Machine”. IEEE Global Conference on Signal and Information Processing, 2015.
[14] Tóth, B. P., Czeba, B. “Convolutional Neural Networks for Large-Scale Bird Song Classification in Noisy Environment”. Working Notes of CLEF, 2016.
[15] Narasimhan, R., Fern, X. Z., Raich, R. “Simultaneous segmentation and classification of bird song using CNN”. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
[16] Koh, C. Y., et al. “Bird Sound Classification using Convolutional Neural Networks”. Working Notes of CLEF 2019, 2019.
[17] Hinton, G., et al. “Deep neural networks for acoustic modelling in speech recognition: The shared views of four research groups”. IEEE Signal Processing Magazine, 29(6), pp.82-97, 2012.
[18] Müller, L., Marti, M. “Bird sound classification using a bidirectional LSTM”. Working Notes of CLEF 2018 (Cross Language Evaluation Forum), 2018.
[19] Russakovsky, O., et al. “Imagenet large scale visual recognition challenge”. International Journal of Computer Vision, 115(3), 211–252, 2015.
[20] Krizhevsky, A., Sutskever, I., Hinton, G. E. “Imagenet classification with deep convolutional neural networks”. Advances in neural information processing systems, pp. 1097-1105, 2012.
[21] Ioffe, S., Szegedy, C. “Batch normalization: Accelerating deep network training by reducing internal covariate shift”. arXiv preprint arXiv:1502.03167, 2015.
[22] Nair, V., Hinton, G. E. “Rectified linear units improve restricted boltzmann machines”. Proc. 27th International Conference on Machine Learning, pp. 807-814, 2010.
[23] Pham, A. T., Fern, X. Z., Raich, R. “Dynamic programming for instance annotation in multi-instance multi-label learning”. IEEE Trans. on PAMI, 2014.
[24] Zhou, Z. H., Zhang, M. L., Huang, S. J., Li, Y. F. “Multi-instance multi-label learning”. Artificial Intelligence, pp. 2291–2320, 2012.
[25] He, K., Zhang, X., Ren, S., Sun, J. “Deep residual learning for image recognition”. Proc. IEEE Conf. Computer Vision and Pattern Recognition. pp. 770–778, 2016.
[26] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z. “Rethinking the inception architecture for computer vision”. Proc. IEEE Conf. Computer Vision and Pattern Recognition. pp. 2818–2826, 2016.
[27] Kingma, D., Ba, J. “Adam: A method for stochastic optimization”. arXiv preprint, arXiv:1412.6980, 2014.
[28] Lasseck, M. “Bird song classification in field recordings: winning solution for NIPS4B 2013 competition”. Proc. Int. Symp. Neural Information Scaled for Bioacoustics, joint to NIPS, Nevada. pp. 176–181, 2013.

Full text public date 2025/08/05 (Intranet public)
Full text public date 2025/08/05 (Internet public)
Full text public date 2025/08/05 (National library)