
Author: TIAS KURNIATI
Thesis Title: A STUDY OF SOUND GENERATION WITH TWO APPROACHES
Advisor: Chuan-Kai Yang (楊傳凱)
Committee: Bor-Shen Lin (林伯慎), Yuan-Cheng Lai (賴源正), Chuan-Kai Yang (楊傳凱)
Degree: Master
Department: Department of Information Management, School of Management
Thesis Publication Year: 2018
Graduation Academic Year: 106
Language: English
Pages: 43
Keywords: Deep learning, sonification, sound generation, image, object detection
Reference times: Clicks: 357, Downloads: 1
Abstract: Nowadays, sound generation has become one of the directions in multimedia research, and people are searching for new methods to generate interesting sounds. In this research we address the problem of building a multimedia system that produces sound from a given image through two different approaches: color-based segmentation and object detection. We use jQuery audiosynth.js to generate the sound of notes in the color-mapping sonification system, while YOLOv3 is used for object detection in the object-detection sonification system, which plays a suitable sound from a local database matching the object detected. We implement both systems on a web-based platform using JavaScript with Node.js, targeting modern web browsers that support the Web Audio API; Mozilla Firefox and Google Chrome already support this feature. Because the web-based sonification systems do not depend on a particular platform, they can also be used on platforms such as Android and Windows. The purpose of the research is to generate a pleasing sound for an image through the two approaches presented. A user study using online programs and questionnaires was performed to evaluate the systems; the results indicate that most users agree that the sonification systems presented are interesting and unique.
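
To make the color-mapping idea above concrete, the following is a minimal sketch, not the thesis's code (which uses jQuery audiosynth.js): it samples one pixel's hue from canvas image data and plays a corresponding note through the Web Audio API, the browser feature the abstract relies on. The hue-to-frequency mapping here is a hypothetical stand-in for the thesis's actual color-to-note scheme.

// Minimal sketch of mapping a pixel's color to a note via the Web Audio API.
// Not the thesis implementation; the mapping below is illustrative only.
function playNoteForPixel(imageData, x, y, audioCtx) {
  // Read the RGB values of the pixel at (x, y).
  const i = (y * imageData.width + x) * 4;
  const r = imageData.data[i], g = imageData.data[i + 1], b = imageData.data[i + 2];

  // Crude hue estimate in [0, 1); gray pixels map to hue 0.
  const max = Math.max(r, g, b), min = Math.min(r, g, b);
  let hue = 0;
  if (max !== min) {
    const d = max - min;
    if (max === r)      hue = ((g - b) / d + 6) % 6;
    else if (max === g) hue = (b - r) / d + 2;
    else                hue = (r - g) / d + 4;
    hue /= 6;
  }

  // Map the hue onto one octave starting at A4 (440 Hz) and play a short tone.
  const osc = audioCtx.createOscillator();
  osc.frequency.value = 440 * Math.pow(2, hue);
  osc.connect(audioCtx.destination);
  osc.start();
  osc.stop(audioCtx.currentTime + 0.5);
}

// Usage (in a browser that supports the Web Audio API, e.g. Firefox or Chrome):
//   const ctx = new AudioContext();
//   const pixels = canvas.getContext('2d').getImageData(0, 0, canvas.width, canvas.height);
//   playNoteForPixel(pixels, 10, 20, ctx);
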



Table of Contents:
Master’s Thesis Recommendation Form i
Qualification Form by Master’s Degree Examination Committee ii
Abstract iii
Acknowledgment iv
Table of Contents v
List of Tables vii
List of Figures viii
Chapter 1. Introduction 1
  1.1 Background 1
  1.2 Aims of Study 1
  1.3 YOLO (You Only Look Once) 2
  1.4 Research Scope 3
  1.5 Research Outline 3
Chapter 2. Related Work 4
  2.1 Sonification 4
  2.2 The Process of Sound Generation 5
    2.2.1 Mapping Approach 5
    2.2.2 Object Detection Approach 9
Chapter 3. Proposed System 12
  3.1 Overview System 12
  3.2 System Architecture 13
    3.2.1 Obtaining Color Information and Sound Generation (first approach) 13
    3.2.2 Object Detection and Sound Generation (second approach) 14
  3.3 Node.js 16
  3.4 YOLOv3 16
    3.4.1 Bounding Box Prediction 16
    3.4.2 Class Prediction 18
    3.4.3 Prediction Across Scales 18
    3.4.4 Feature Extraction 18
  3.5 Weights 19
Chapter 4. Experimental Result 22
  4.1 Experiments 22
  4.2 Results 24
    4.2.1 Color Mapping Sonification 24
    4.2.2 Object Detection Sonification 26
Chapter 5. Conclusions and Future Work 28
  5.1 Conclusions 28
  5.2 Limitation and Future Work 28
References 30
Appendix 32


Full text public date: 2019/07/13 (Intranet)
Full text public date: This full text is not authorized to be published. (Internet)
Full text public date: This full text is not authorized to be published. (National library)