
Graduate student: 李柏諺 (Bo-yan Li)
Thesis title: 自然景觀聽覺化 (Natural Landscape Sonification)
Advisor: 楊傳凱 (Chuan-Kai Yang)
Oral defense committee: 林伯慎 (Bor-Shen Lin), 梁容輝 (Rung-Huei Liang)
Degree: Master
Department: College of Management - Department of Information Management
Year of publication: 2012
Academic year of graduation: 100
Language: Chinese
Number of pages: 47
Chinese keywords: 影像聽覺化, 音景, 物件辨識, 音訊處理
Foreign-language keywords: Image Sonification, Soundscape, Object Recognition, Audio Processing

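As quality of life improves, demand for multi-sensory experiences has gradually emerged. One example is the combination of sight, smell, and taste: a dish should not only taste good, and an attractive presentation makes diners find it even more delicious. Another is the combination of sight and hearing: music playback today is no longer plain audio output, but is accompanied by spectrum effects matched to the wavelength and rhythm of the music, giving listeners a twofold enjoyment. This inspired my idea: if, while a user views a natural image, the sound corresponding to the position being looked at could play automatically, the user could appreciate the image visually and also perceive its content aurally, achieving an interaction between vision and hearing.
This thesis therefore proposes the concept of listening to an image: while viewing an image, the user can understand its content with the aid of hearing. Because the sounds are semantic, people can naturally map a sound to the image, which leaves a deeper impression when viewing. The main framework of the thesis identifies image content using color moments and assigns semantic sounds, then uses a saliency map and regional richness to render the same object at different levels of detail, so that vision and hearing together give the viewer a deeper impression of the image.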


In modern society, it has become common for people to seek satisfying experiences from more than one sense. For example, a perfect meal should not only look good but also taste and smell good. Similarly, a music player can not only play a melody but also display its changes in wavelength and tempo through a visual interface. Such a concept can also be applied to viewing an image: when a viewer looks at a region of an image, a corresponding sound can be generated. As a result, an image can be not only seen but also heard.
In this thesis, we propose an approach to image sonification. When a viewer explores an image with the mouse cursor, our system generates a sound that semantically matches the image content under the cursor. We use color moments to identify the content of the focused region, generate a semantic sound, and adjust the sound's parameters according to the image's saliency map and the color richness of the focused region. The viewer may thus form a stronger impression of the image through stimuli from both vision and hearing.
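The color-moment step mentioned above can be illustrated with a minimal sketch. It assumes the common formulation in which each of the three color channels is summarized by its mean, standard deviation, and cube root of the third central moment; this record does not state which moments, color space, or classifier the thesis actually uses. The names `color_moments` and `nearest_label`, the prototype labels, and the L1 distance are all hypothetical choices made for illustration.

```python
import math

def color_moments(pixels):
    """Summarize a region by 9 numbers: (mean, std, skew) per channel.

    pixels: iterable of 3-tuples, one per pixel (color space is an assumption).
    """
    pixels = list(pixels)
    n = len(pixels)
    features = []
    for ch in range(3):
        values = [p[ch] for p in pixels]
        mean = sum(values) / n
        second = sum((v - mean) ** 2 for v in values) / n
        third = sum((v - mean) ** 3 for v in values) / n
        std = math.sqrt(second)
        # Signed cube root of the third central moment.
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        features.extend([mean, std, skew])
    return features

def nearest_label(pixels, prototypes):
    """Return the label whose prototype vector is closest in L1 distance.

    prototypes: dict mapping a hypothetical label (e.g. "sky") to a
    9-element color-moment vector computed from example regions.
    """
    feats = color_moments(pixels)

    def distance(label):
        return sum(abs(a - b) for a, b in zip(feats, prototypes[label]))

    return min(prototypes, key=distance)
```

A label returned this way could then select which semantic sound to play for the region under the cursor; how the thesis maps labels to sounds is not described in this record.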

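Table of Contents
1. Introduction
1.1 Research Motivation and Objectives
1.2 Thesis Organization
2. Literature Review
2.1 Object Recognition
2.2 Sonification Research
2.3 Sound Modulation
3. Image and Audio Processing
3.1 Image Processing: Color Moments
3.2 System Flow
4. System Implementation
4.1 Main Image Processing
4.2 Audio Adjustment
4.3 Background Music
4.4 Manual Changes
5. Results
5.1 System Environment
5.2 Demonstration of Results
5.3 Functional Comparison
5.4 Performance Evaluation
6. Conclusion and Future Work
References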

[1]. I. Laptev, “Improvements of Object Detection Using Boosted Histograms,” Proceedings of the 17th British Machine Vision Conference, 2006.
[2]. W.S. Yeo, J. Berger, “Application of Image Sonification Methods to Music,” International Computer Music Conference, 2005.
[3]. F. Grond, S. Janssen, S. Schirmer, T. Hermann, “Browsing RNA Structures by Interactive Sonification,” Proceedings of ISon 2010.
[4]. J.L. Shih, L.H. Chen, “Colour image retrieval based on primitives of colour moments,” Image Signal Processing, 2002.
[5]. C. Gu, J.J. Lim, P. Arbelaez, J. Malik, “Recognition using Regions,” Computer Vision and Pattern Recognition, 2009.
[6]. W.S. Yeo, J. Berger, “Raster Scanning: A New Approach to Image Sonification, Sound Visualization, Sound Analysis And Synthesis,” International Computer Music Conference Proceedings, 2006.
[7]. T. Hermann, A. Hunt, “An Introduction to Interactive Sonification,” IEEE Multimedia, 2005.
[8]. T. Malisiewicz, A.A. Efros, “Recognition by Association via Learning Per-exemplar Distances,” Computer Vision and Pattern Recognition, 2008.
[9]. M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M. Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, P. Yanker, “Query by Image and Video Content: The QBIC System,” IEEE Computer, 1995.
[10]. S. Roucos, A.M. Wilgus, “High quality time-scale modification for speech,” ICASSP, pp. 493-496, 1985.
[11]. J. Laroche, M. Dolson, “Improved Phase Vocoder Time-Scale Modification of Audio,” IEEE Trans. on Speech and Audio Processing, 1999.
[12]. M. Flickner, H. Sawhney, W. Niblack, “Query by image and video content: the QBIC system,” IEEE Computer Society, 1995.
[13]. M. Dolson, “The Phase Vocoder: A Tutorial,” Computer Music Journal, 1986.
[14]. D. Dorran, R. Lawlor, E. Coyle, “Time-scale modification of speech using a synchronised and Adaptive Overlap-Add (SAOLA) Algorithm,” Audio Engineering Society Convention, 2003.
[15]. B. Schiele, J. Crowley, “Recognition without correspondence using multidimensional receptive field histogram,” International Journal of Computer Vision, 2000.
[16]. N. Akrout, R. Prost, R. Goutte, “Image compression by vector quantization: a review focused on codebook generation,” Image and Vision Computing, 1994.
[17]. L. Itti, C. Koch, E. Niebur, “A Model of Saliency-Based Visual Attention for Rapid Scene Analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998.
[18]. D. Frohlich, E. Tallyn, “Audiophotography: practice and prospects,” ACM SIGCHI Conference on Human Factors in Computing Systems, 1999.
[19]. T. Hermann, A. Hunt, “An Introduction to Interactive Sonification,” IEEE Multimedia, 2005.
[20]. 陳顯榮, 黃政杰, 陳俊伯, 李佑聰, “倒聽圖說,” undergraduate capstone project, Department of Information Management, National Taiwan University of Science and Technology, 2010.
