
Graduate student: 曾政傑 (Jheng Jie Zeng)
Thesis title: 基於多重訊號分類之聲源方位偵測 (Sound Source Localization Based on Multiple Signal Classification)
Advisor: 古鴻炎 (Hung-Yan Gu)
Oral defense committee: 鍾國亮 (Kuo-Liang Chung), 林伯慎 (Bor-Shen Lin), 王新民 (Hsin-Min Wang), 余明興 (Ming-Shing Yu)
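Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: Chinese
Number of pages: 73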
Keywords: Multiple Signal Classification, Microphone Array, Voice Activity Detection, Time-Stretched Pulse, Maximum Likelihood Beamformer
Abstract: This thesis studies the MUSIC (Multiple Signal Classification) algorithm and uses it to process the input of a four-channel microphone array in order to build a sound source direction detection system. On the hardware side, we designed a preamplifier circuit for the microphones and used a USB data acquisition (DAQ) device to capture the signals into a computer for processing. On the software side, a VAD (Voice Activity Detection) module first classifies the input signal as speech or non-speech; the MUSIC algorithm then determines the direction of the sound source, and that direction is used to update the coefficients of the Maximum Likelihood Adaptive Filter. Before MUSIC can be computed, the impulse frequency response coefficients it requires must be obtained by analyzing TSP (Time-Stretched Pulse) signals, and this TSP analysis is a difficult step. After integrating the hardware and software, we carried out experiments to evaluate direction-angle detection. Preliminary results show that detection is accurate only in a semi-anechoic chamber; in other environments the accuracy is unsatisfactory and leaves room for improvement.
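The direction-finding step described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes a single narrowband source, an ideal far-field uniform linear array, and simulated data, whereas the thesis obtains its array response coefficients from TSP measurements. The spacing, analysis frequency, noise level, and all function names here are hypothetical. With one source, the noise subspace is the orthogonal complement of the dominant eigenvector of the spatial covariance matrix, so plain power iteration is enough and no linear algebra library is needed.

```python
import cmath
import math
import random

SPEED_OF_SOUND = 343.0   # m/s; assumed room-temperature value
MIC_SPACING = 0.05       # m; hypothetical spacing, not taken from the thesis
FREQ_HZ = 2000.0         # Hz; single narrowband analysis frequency (assumed)


def steering_vector(angle_deg, n_mics=4):
    """Ideal far-field response of a uniform linear array (an assumption)."""
    delay = MIC_SPACING * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [cmath.exp(-2j * math.pi * FREQ_HZ * m * delay) for m in range(n_mics)]


def covariance(snapshots):
    """Sample spatial covariance R = mean(x x^H) over all snapshots."""
    n = len(snapshots[0])
    r = [[0j] * n for _ in range(n)]
    for x in snapshots:
        for i in range(n):
            for j in range(n):
                r[i][j] += x[i] * x[j].conjugate()
    return [[value / len(snapshots) for value in row] for row in r]


def principal_eigenvector(r, iterations=200):
    """Power iteration for the dominant eigenvector of a Hermitian matrix."""
    n = len(r)
    v = [1 + 0j] + [0j] * (n - 1)
    for _ in range(iterations):
        w = [sum(r[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(abs(c) ** 2 for c in w))
        v = [c / norm for c in w]
    return v


def music_spectrum(r, angles_deg):
    """Single-source MUSIC pseudo-spectrum 1 / ||E_n^H a(theta)||^2."""
    u = principal_eigenvector(r)  # spans the signal subspace (one source assumed)
    spectrum = []
    for angle in angles_deg:
        a = steering_vector(angle, len(r))
        in_signal = abs(sum(ui.conjugate() * ai for ui, ai in zip(u, a))) ** 2
        # Energy of a(theta) left in the noise subspace: ||a||^2 - |u^H a|^2.
        in_noise = sum(abs(ai) ** 2 for ai in a) - in_signal
        spectrum.append(1.0 / max(in_noise, 1e-12))
    return spectrum


def simulate(angle_deg, n_mics=4, n_snapshots=200, noise_std=0.05, seed=0):
    """Synthetic snapshots: one unit-amplitude source plus Gaussian noise."""
    rng = random.Random(seed)
    a = steering_vector(angle_deg, n_mics)
    snapshots = []
    for _ in range(n_snapshots):
        s = cmath.exp(2j * math.pi * rng.random())  # random source phase
        snapshots.append([
            s * ai + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            for ai in a
        ])
    return snapshots


def estimate_angle(snapshots):
    """Return the scan angle (1-degree grid) with the highest pseudo-spectrum."""
    angles = list(range(-90, 91))
    spectrum = music_spectrum(covariance(snapshots), angles)
    return angles[spectrum.index(max(spectrum))]


if __name__ == "__main__":
    print(estimate_angle(simulate(30.0)))
```

On this clean simulated data the estimate should land near the simulated 30 degrees. Reverberation violates the single-path model that the steering vectors assume, which is consistent with the abstract's report that detection was accurate only in a semi-anechoic chamber.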

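Table of contents:
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures and Tables
Chapter 1 Introduction
    1.1 Research Motivation and Objectives
    1.2 Literature Review
    1.3 Research Method
    1.4 Thesis Organization
Chapter 2 Sound Source Angle Detection by Multiple Signal Classification
    2.1 Architecture of Sound Source Angle Detection and Speech Enhancement
    2.2 Sound Source Localization by Multiple Signal Classification
        2.2.1 Data Model
        2.2.2 The Multiple Signal Classification Method
    2.3 Analysis of Microphone Impulse Responses
        2.3.1 Introduction to the Time-Stretched Pulse Signal
        2.3.2 Generation of the Time-Stretched Pulse Signal
        2.3.3 Analysis Procedure for Microphone Frequency Responses
Chapter 3 Voice Activity Detection and Speech Enhancement System
    3.1 Voice Activity Detection Methods
        3.1.1 Introduction to Voice Activity Detection
        3.1.2 Voice Activity Detection Methods
    3.2 Adaptive Signal Processing
        3.2.1 Introduction to Adaptive Filters
        3.2.2 Adaptive Filter for Maximum Likelihood Beamforming
Chapter 4 System Implementation
    4.1 Hardware Components
        4.1.1 Microphones
        4.1.2 Signal Amplifier Circuit
        4.1.3 DAQ
    4.2 Software Implementation and Parameter Settings
        4.2.1 Measurement and Analysis of the Microphone Array Frequency Response
        4.2.2 Implementation of Voice Activity Detection
        4.2.3 Improvement and Implementation of MUSIC Sound Source Angle Detection
        4.2.4 Implementation of the Adaptive Filter for Maximum Likelihood Beamforming
        4.2.5 Software Interface of the System
Chapter 5 Test Experiments
    5.1 Offline System Tests
    5.2 Online System Tests
Chapter 6 Conclusion
References
About the Author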

