Graduate student: | 李柏諺 Bo-yan Li |
---|---|
Thesis title: | 自然景觀聽覺化 Natural Landscape Sonification |
Advisor: | 楊傳凱 Chuan-Kai Yang |
Oral defense committee: | 林伯慎 Bor-Shen Lin, 梁容輝 Rung-Huei Liang |
Degree: | 碩士 Master |
Department: | School of Management - Department of Information Management |
Publication year: | 2012 |
Academic year of graduation: | 100 |
Language: | Chinese |
Pages: | 47 |
Chinese keywords: | 影像聽覺化、音景、物件辨識、音訊處理 |
English keywords: | Image Sonification, Soundscape, Object Recognition, Audio Processing |
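As quality of life improves, demands that combine several senses are gradually emerging. Sight, smell, and taste can be combined: a dish should not only taste good, and when it is also presented attractively, diners find it even more delicious. Sight and hearing can be combined as well: playing music today is no longer just sound output, since players also show spectrum effects matched to the wavelength and rhythm of the music, giving listeners a twofold enjoyment. This inspired my idea: if, while viewing a natural image, the sound corresponding to the position being looked at could be played automatically, users could appreciate the image visually and also perceive its content through hearing, achieving interaction between sight and hearing.
This thesis therefore proposes the concept of listening to images: while viewing an image, the user can understand its content with the aid of hearing. Because semantic sounds are used, people can naturally map each sound to the image and so form a deeper impression while viewing it. The main framework of the thesis uses color moments to recognize the image content and assign a semantic sound, and then uses a saliency map and regional richness to render the same object at different levels of detail, so that vision and hearing work together to give the viewer a deeper impression of the image.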
In modern society, it has become common for people to seek satisfying perceptions from more than one sense. For example, a perfect meal should not only look good but also taste and smell good. Similarly, a music player can not only play a melody but also display its changes in wavelength and tempo through a visual interface. The same concept can be applied to viewing an image: when a viewer gazes at an image, a corresponding sound can be generated. As a result, an image can be not only seen but also heard.
In this thesis, we propose the idea of image sonification. When a viewer explores an image with a mouse cursor, our system generates a sound that semantically matches the image content under the cursor. We use color moments to identify the content of the focused region, generate a semantic sound, and vary the sound's parameters according to the image's saliency map and the color richness of the focused region. The viewer may thus form a stronger impression of the image through stimuli from both vision and hearing.
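The abstract does not give the exact color-moment formulation, so the following is only a minimal sketch of the general idea. It assumes the common first three moments per channel (mean, standard deviation, and signed cube root of the third central moment) and a simple nearest-prototype match. The names `color_moments` and `nearest_label`, the labels "sky" and "grass", and the prototype colors are hypothetical illustrations, not taken from the thesis.

```python
import math


def color_moments(pixels):
    """Return [mean, std, skew] for each of three channels (9 values).

    `pixels` is a list of (c0, c1, c2) tuples, e.g. the RGB values of the
    region around the mouse cursor. This layout is an assumption.
    """
    n = len(pixels)
    if n == 0:
        raise ValueError("region must contain at least one pixel")
    features = []
    for c in range(3):
        values = [p[c] for p in pixels]
        mean = sum(values) / n
        var = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        # Signed cube root keeps the skewness in the channel's own units.
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        features.extend([mean, math.sqrt(var), skew])
    return features


def nearest_label(features, prototypes):
    """Return the label whose prototype vector is closest (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(features, prototypes[label]))


# Hypothetical prototypes built from uniform color patches.
prototypes = {
    "sky": color_moments([(100, 150, 255)] * 4),
    "grass": color_moments([(30, 160, 40)] * 4),
}
region = [(95, 145, 250)] * 4  # pixels under the cursor (made-up values)
label = nearest_label(color_moments(region), prototypes)
```

In a full system, the returned label would select which semantic sound to play; how the saliency map and color richness then modulate that sound is not specified in the abstract and is left out here.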