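Speech-Quality Improving for Internet Telephony | National Taiwan University of Science and Technology Theses and Dissertations System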

Graduate student:	Jia-Sin Chen (陳佳新)
Thesis title:	Speech-Quality Improving for Internet Telephony (網路電話語音品質改進之研究)
Advisor:	Hung-Yan Gu (古鴻炎)
Oral defense committee:	Chien-Hua Chen (陳建華), Shaw-Hwa Hwang (黃紹華), Ge-Ming Chiu (邱舉明), Shi-Jinn Horng (洪西進)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication:	2005
Graduation academic year:	93 (ROC calendar)
Language:	Chinese
Number of pages:	96
Chinese keywords:	網路電話, 封包遺失補償, 語音品質
English keywords:	internet telephony, packet loss concealment, speech quality

This thesis studies effective compensation methods, on both the transmitter side and the receiver side of an internet telephony program, to mitigate the effect of packet loss on speech quality. The proposed methods are a transmitter-based PLC (Packet Loss Concealment) method that uses a packet dropper and a receiver-based PLC method.

The transmitter-based method changes the distribution of packet transmission failures without degrading the transmission failure rate, which greatly helps the receiver reconstruct the waveform of a lost packet. Experimental results show that, among all bursts of consecutive transmission failures, the share of bursts of length 1 or 2 rises from 26.01% with the original method to 73.62% with ours, while the average transmission delay introduced is 2.12 packet processing times.

The receiver-based method first detects the pitch lengths in the packets immediately before and after the lost packet, then determines how many pitch cycles to place in the lost packet and the length of each, and finally interpolates the waveform of each pitch cycle from the pitch waveforms on the two sides. We call this method Time-Proportion Based Pitch-Waveform Interpolation (TPBPWI). In subjective listening tests, TPBPWI yields better speech quality than the other waveform reconstruction methods compared: Double Sided Pitch Waveform Replication (DSPWR), Modified Waveform Similarity OverLap Add (Modified WSOLA), Waveform Substitution based on Pattern Matching (PM), and Packet Repetition (Rep). In addition, TPBPWI has been integrated into the WinRTP program and tested over a narrow-bandwidth dial-up network, where perception-based evaluation again shows a significant improvement in speech quality.
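The receiver-side steps described in the abstract (decide how many pitch cycles fit in the lost packet, assign each a length, then interpolate each cycle from the pitch waveforms on the two sides) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes pitch detection has already produced one pitch cycle from each side, and the function names (`resample`, `plan_cycle_lengths`, `conceal_gap`), the linear length progression, and the time-position weighting rule are all hypothetical choices made only for illustration.

```python
import math


def resample(cycle, n):
    """Linearly resample one pitch cycle (a list of samples) to n samples."""
    if n == 1:
        return [cycle[0]]
    scale = (len(cycle) - 1) / (n - 1)
    out = []
    for i in range(n):
        pos = i * scale
        lo = int(pos)
        hi = min(lo + 1, len(cycle) - 1)
        frac = pos - lo
        out.append((1 - frac) * cycle[lo] + frac * cycle[hi])
    return out


def plan_cycle_lengths(gap, left_len, right_len):
    """Pick how many pitch cycles fill `gap` samples and the length of each.

    Assumption for this sketch: lengths move linearly from the left pitch
    length toward the right one and are then scaled to fill the gap exactly.
    """
    mean_len = (left_len + right_len) / 2
    count = max(1, round(gap / mean_len))
    raw = [left_len + (right_len - left_len) * (k + 1) / (count + 1)
           for k in range(count)]
    scale = gap / sum(raw)
    lengths = [max(1, round(r * scale)) for r in raw]
    lengths[-1] += gap - sum(lengths)  # absorb rounding error in the last cycle
    if lengths[-1] < 1:
        raise ValueError("gap too short for the planned pitch cycles")
    return lengths


def conceal_gap(left_cycle, right_cycle, gap):
    """Fill a lost span of `gap` samples by blending the two side cycles.

    Each synthesized cycle is weighted by the time position of its centre
    inside the gap: near the left edge it resembles the left cycle, near the
    right edge the right cycle (an assumed reading of "time-proportion").
    """
    lengths = plan_cycle_lengths(gap, len(left_cycle), len(right_cycle))
    out = []
    pos = 0
    for n in lengths:
        w = (pos + n / 2) / gap
        a = resample(left_cycle, n)
        b = resample(right_cycle, n)
        out.extend((1 - w) * x + w * y for x, y in zip(a, b))
        pos += n
    return out


if __name__ == "__main__":
    # Synthetic pitch cycles: 40 samples on the left, 50 on the right.
    left = [math.sin(2 * math.pi * i / 40) for i in range(40)]
    right = [0.5 * math.sin(2 * math.pi * i / 50) for i in range(50)]
    filled = conceal_gap(left, right, 160)
    print(len(filled), plan_cycle_lengths(160, 40, 50))
```

The sketch leaves out the other stages listed in the thesis outline, such as pitch detection, synchronization of the pitch waveforms, amplitude adjustment, and the handling of unvoiced segments.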

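Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures and Tables
Chapter 1  Introduction
1  Research Background
2  Research Motivation
3  Research Methods
4  Thesis Organization
Chapter 2  Review of Packet Loss Concealment Methods
1  Transmitter-Based Concealment Methods
1.1  Active
1.2  Passive
1.3  Protocol
2  Receiver-Based Concealment Methods
2.1  Insertion
2.2  Interpolation
2.3  Regeneration
Chapter 3  Transmitter-Based Concealment Method
1  Network State Detection
2  Reduced-Packet Transmission Method
3  Effects of Packet Loss
Chapter 4  Receiver-Based Concealment Method
1  Approaches Based on Pitch Waveforms from Both Sides
1.1  Review of the DSPWR Method
1.2  The Proposed TPBPWI Method
2  Pitch Detection
2.1  Normalized Autocorrelation Function
2.2  Pitch Detection Algorithm
2.3  Improving Detection Accuracy
3  Pitch-Cycle Placement
3.1  Synchronization and Repair of Pitch Waveforms
3.2  Determining the Number and Lengths of Pitch Cycles
4  Pitch Waveform Interpolation
5  Waveform Amplitude Adjustment
5.1  Forward Amplitude Adjustment
5.2  Backward Amplitude Adjustment
6  Reconstruction of Unvoiced Waveforms
Chapter 5  Simulation Experiments and Evaluation
1  Pitch Detection Accuracy
2  Packet Loss Model
3  SNR Comparison
4  Comparison of Waveform Reconstruction Methods
4.1  Experiment at a 10% Packet Loss Rate
4.2  Experiment at a 30% Packet Loss Rate
4.3  Experiment at a 50% Packet Loss Rate
5  Listening Test Evaluation
Chapter 6  Integration of the Concealment Methods with WinRTP
1  Introduction to WinRTP
1.1  WinRTP Transmitter Processing Flow
1.2  WinRTP Receiver Processing Flow
2  Integrating the Transmitter-Based Concealment Method with WinRTP
3  Integrating the Receiver-Based Concealment Method with WinRTP
4  Experiments and Evaluation in a Dial-Up Environment
4.1  Evaluation of the Transmitter-Based Concealment Method
4.2  Evaluation of the Receiver-Based Concealment Method
Chapter 7  Conclusion
References
About the Author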

                                
