Graduate student: | 陳佳新 Jia-Sin Chen |
---|---|
Thesis title: | 網路電話語音品質改進之研究 Speech-Quality Improving for Internet Telephony |
Advisor: | 古鴻炎 Hung-Yan Gu |
Oral defense committee: | 陳建華 Chien-Hua Chen, 黃紹華 Shaw-Hwa Hwang, 邱舉明 Ge-Ming Chiu, 洪西進 Shi-Jinn Horng |
Degree: | 碩士 Master |
Department: | 電資學院 - 資訊工程系 Department of Computer Science and Information Engineering |
Year of publication: | 2005 |
Academic year of graduation: | 93 (ROC calendar) |
Language: | Chinese |
Number of pages: | 96 |
Chinese keywords: | 網路電話, 封包遺失補償, 語音品質 |
English keywords: | internet telephony, packet loss concealment, speech quality |
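To alleviate the impact of packet loss on speech quality, this thesis studies effective concealment methods for both the sender-side and the receiver-side programs of internet telephony. The proposed sender-side method changes the distribution of packet transmission failures without worsening the transmission failure rate, which greatly helps waveform reconstruction at the receiver. Experimental results show that, with our method, the proportion of consecutive transmission-failure events whose length is 1 or 2 rises from 26.01% to 73.62%, while the average transmission delay introduced is 2.12 packet processing times. The proposed receiver-side method first detects the pitch lengths in the packets immediately before and after the lost packet, then determines how many reconstructed pitch cycles to place in the lost packet and the length of each, and finally interpolates the waveform of each pitch cycle. We call this method Time-Proportion Based Pitch-Waveform Interpolation (TPBPWI). Subjective listening tests show that our method performs better than the other waveform reconstruction methods compared: Double Sided Pitch Waveform Replication (DSPWR), Modified Waveform Similarity OverLap Add (Modified WSOLA), Waveform Substitution based on Pattern Matching (PM), and Packet Repetition (Rep). In an experiment over a narrow-bandwidth dial-up network, subjective listening tests also show that TPBPWI improves speech quality.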
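The burst-length statistic quoted above can be illustrated with a short sketch. It assumes the proportion is taken over failure bursts (runs of consecutive failures) rather than over individual packets, which is one reading of the abstract. The function names and the sample trace are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import groupby


def burst_length_histogram(sent_ok):
    """Count runs of consecutive failures (False values) by run length."""
    runs = Counter()
    for ok, group in groupby(sent_ok):
        if not ok:
            runs[sum(1 for _ in group)] += 1
    return runs


def short_burst_ratio(sent_ok, max_len=2):
    """Fraction of failure bursts whose length is at most max_len."""
    runs = burst_length_histogram(sent_ok)
    total = sum(runs.values())
    if total == 0:
        return 0.0  # no failures at all, so no bursts to classify
    short = sum(n for length, n in runs.items() if length <= max_len)
    return short / total


# Hypothetical trace: True = packet sent, False = transmission failure.
trace = [True, False, True, True, False, False, False,
         True, False, False, True]
print(dict(burst_length_histogram(trace)))
print(short_burst_ratio(trace))
```

A sender-side scheme of the kind described would aim to raise `short_burst_ratio` while leaving the overall fraction of `False` entries unchanged.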
In this thesis, we develop efficient methods to mitigate the effects of packet loss in internet telephony. The proposed methods comprise a transmitter-based PLC (Packet Loss Concealment) method that uses a packet dropper and a receiver-based PLC method. The transmitter-based method changes the distribution of packet transmission failures without degrading the transmission failure rate, which makes it much easier for the receiver to reconstruct the waveform of a lost packet. With our method, the proportion of loss bursts of length 1 or 2 rises to 73.62%, compared with 26.01% for the original method, and the average transmission delay induced is only 2.12 packet times. The receiver-based method detects the pitch lengths of the waveform on the two sides of a lost packet, determines the number of pitch cycles and the pitch lengths to be placed into the lost packet, and interpolates the pitch waveforms from the two sides to synthesize the missing pitch waveforms. This method is called Time-Proportion Based Pitch-Waveform Interpolation (TPBPWI). In subjective perception tests, our method obtains better speech quality than the other waveform reconstruction methods compared: Double Sided Pitch Waveform Replication (DSPWR), Modified Waveform Similarity OverLap Add (Modified WSOLA), Waveform Substitution based on Pattern Matching (PM), and Packet Repetition (Rep). In addition, TPBPWI has been integrated into the WinRTP program and tested over a narrow-bandwidth dial-up network. The perception-based evaluation shows that our method obtains a significant improvement in speech quality.
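The abstract only outlines the receiver-side steps, so the following is a rough sketch of the general idea rather than the thesis's implementation. It assumes the pitch cycles on both sides have already been extracted (pitch detection is omitted), that the cycle lengths glide linearly from one side to the other, and that each synthesized cycle is a cross-fade of the two neighbouring cycles weighted by its time position in the gap. All function names and these specific rules are assumptions made for illustration.

```python
import math


def resample(cycle, length):
    """Linearly resample one pitch cycle to `length` samples."""
    if length == 1 or len(cycle) == 1:
        return [cycle[0]] * length
    scale = (len(cycle) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(cycle) - 1)
        frac = pos - lo
        out.append((1 - frac) * cycle[lo] + frac * cycle[hi])
    return out


def plan_cycle_lengths(p_prev, p_next, gap):
    """Choose how many cycles fill `gap` samples and how long each is."""
    count = max(1, round(gap / ((p_prev + p_next) / 2)))
    # Assumed rule: ideal lengths glide linearly from p_prev to p_next.
    ideal = [p_prev + (p_next - p_prev) * (k + 1) / (count + 1)
             for k in range(count)]
    total = sum(ideal)
    # Rescale cumulative boundaries so the integer lengths sum to gap.
    bounds = [round(gap * sum(ideal[:k]) / total) for k in range(count + 1)]
    return [bounds[k + 1] - bounds[k] for k in range(count)]


def conceal_gap(prev_cycle, next_cycle, gap):
    """Fill a gap of `gap` samples by interpolating two pitch cycles."""
    lengths = plan_cycle_lengths(len(prev_cycle), len(next_cycle), gap)
    out, start = [], 0
    for length in lengths:
        # Weight is the relative time of this cycle's centre in the gap.
        w = (start + length / 2) / gap
        a = resample(prev_cycle, length)
        b = resample(next_cycle, length)
        out.extend((1 - w) * x + w * y for x, y in zip(a, b))
        start += length
    return out


# Hypothetical input: one sine cycle of 40 samples before the gap,
# one of 50 samples after it, and a 160-sample lost packet.
prev_cycle = [math.sin(2 * math.pi * n / 40) for n in range(40)]
next_cycle = [math.sin(2 * math.pi * n / 50) for n in range(50)]
filled = conceal_gap(prev_cycle, next_cycle, 160)
print(plan_cycle_lengths(40, 50, 160), len(filled))
```

Cycles near the start of the gap resemble the preceding waveform and cycles near the end resemble the following one, which is the intuition behind weighting by time proportion. A real implementation would also need to align cycle boundaries with the surrounding packets.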