Graduate student: | 曾政傑 Jheng Jie Zeng |
---|---|
Thesis title: | 基於多重訊號分類之聲源方位偵測 Sound Source Localization Based on Multiple Signal Classification |
Advisor: | 古鴻炎 Hung-Yan Gu |
Oral defense committee: | 鍾國亮 Kuo-Liang Chung, 林伯慎 Bor-Shen Lin, 王新民 Hsin-Min Wang, 余明興 Ming-Shing Yu |
Degree: | 碩士 Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of publication: | 2008 |
Academic year of graduation: | 96 |
Language: | Chinese |
Number of pages: | 73 |
Keywords (Chinese): | 多重訊號分類、麥克風陣列、語音活動偵測、脈衝展延訊號、最大相似性波束集成器 |
Keywords (English): | Multiple Signal Classification, Microphone Array, Voice Activity Detection, Time-Stretched Pulse, Maximum Likelihood Beamformer |
This thesis studies the MUSIC (Multiple Signal Classification) algorithm and applies it to the signals from a four-channel microphone array in order to build a sound source direction detection system. On the hardware side, we design a preamplifier circuit for the microphones and use a USB data acquisition (DAQ) device to transfer the signals to a computer for processing. On the software side, a VAD (Voice Activity Detection) module first classifies the input signal as speech or non-speech; the MUSIC algorithm then determines the direction of the sound source, and the result is used to update the coefficients of the Maximum Likelihood Adaptive Filter. Because the MUSIC algorithm requires impulse frequency response coefficients, these must first be obtained by analyzing TSP (Time-Stretched Pulse) signals, which is a difficult step. After integrating the hardware and software modules, we evaluate the detection of the sound source's direction angle. Preliminary experimental results show that accurate angle detection is achieved only in a semi-anechoic chamber; in other environments the detection accuracy is unsatisfactory and leaves room for improvement.
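The direction-finding step named in the abstract can be illustrated with a minimal narrowband MUSIC sketch. It is not the thesis implementation: the thesis derives its steering information from measured TSP responses, whereas this sketch assumes a free-field uniform linear array of four microphones, a single source, and a single analysis frequency. All function names, the 5 cm spacing, the 1 kHz frequency, and the simulated data are hypothetical choices made for illustration.

```python
import cmath
import math
import random

SPEED_OF_SOUND = 343.0  # m/s (assumed room-temperature value)
N_MICS = 4              # four channels, as in the abstract
SPACING = 0.05          # m between adjacent microphones (assumed)
FREQ = 1000.0           # Hz, single analysis frequency (assumed)


def steering_vector(theta_deg):
    """Far-field, free-field steering vector of the assumed linear array."""
    delay = SPACING * math.sin(math.radians(theta_deg)) / SPEED_OF_SOUND
    return [cmath.exp(-2j * math.pi * FREQ * m * delay) for m in range(N_MICS)]


def covariance(snapshots):
    """Sample spatial covariance R = mean(x x^H) over all snapshots."""
    n = len(snapshots[0])
    R = [[0j] * n for _ in range(n)]
    for x in snapshots:
        for i in range(n):
            for j in range(n):
                R[i][j] += x[i] * x[j].conjugate()
    return [[value / len(snapshots) for value in row] for row in R]


def principal_eigenvector(R, iters=100):
    """Power iteration; with one source this vector spans the signal subspace."""
    n = len(R)
    v = [1 + 0j] * n
    for _ in range(iters):
        w = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(abs(c) ** 2 for c in w))
        v = [c / norm for c in w]
    return v


def music_spectrum(R, angles):
    """Pseudo-spectrum 1 / (a^H (I - u u^H) a) for each candidate angle."""
    u = principal_eigenvector(R)
    spectrum = []
    for theta in angles:
        a = steering_vector(theta)
        signal_part = abs(sum(ui.conjugate() * ai for ui, ai in zip(u, a))) ** 2
        noise_part = sum(abs(ai) ** 2 for ai in a) - signal_part
        spectrum.append(1.0 / max(noise_part, 1e-12))
    return spectrum


def estimate_direction(snapshots, angles=range(-90, 91)):
    """Return the candidate angle (degrees) at the pseudo-spectrum peak."""
    angles = list(angles)
    spectrum = music_spectrum(covariance(snapshots), angles)
    return angles[spectrum.index(max(spectrum))]


def simulate_snapshots(theta_deg, n_snapshots=200, noise_std=0.1, seed=0):
    """Synthetic frequency-domain snapshots: one source plus sensor noise."""
    rng = random.Random(seed)
    a = steering_vector(theta_deg)
    snapshots = []
    for _ in range(n_snapshots):
        s = complex(rng.gauss(0, 1), rng.gauss(0, 1))
        snapshots.append([
            s * ai + complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))
            for ai in a
        ])
    return snapshots


if __name__ == "__main__":
    print(estimate_direction(simulate_snapshots(30.0)))
```

With a single assumed source, the noise-subspace projector reduces to the identity minus the outer product of the dominant eigenvector, so a full eigendecomposition is not needed here. In a reverberant room the free-field steering vectors would no longer match the true transfer functions, which is consistent with the abstract's observation that measured responses are required and that accuracy drops outside the semi-anechoic chamber.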