Graduate student: 朱玟霖 (Wen-lin Chu)
Thesis title: Using Algorithm to Search for the Dynamic Images of Strobe-Laryngoscope in Automated Analysis Identification System for Glottis Images
Original title: 以演算法搜尋喉閃頻內視鏡動態影像技術於聲門影像的自動化分析辨識系統之研究
Advisor: 郭中豐 (Chung-feng Jeffrey Kuo)
Oral defense committee: 黃昌群 (Chang-chiun Huang), 朱永祥 (Yu-hsiang Chu), 王興萬 (Hsing-wan Wang)
Degree: Master
Department: College of Engineering - Graduate Institute of Automation and Control
Year of publication: 2011
Graduation academic year: 99 (ROC calendar)
Language: Chinese
Number of pages: 108
Keywords (Chinese): 動態影像處理, 聲門週期波形, 灰階適應性熵值, 區域分割, 支持向量機
Keywords (English): Dynamic Image Processing, Glottis Area Waveform, Gray Level Adaptive Entropy Value, Area Segmentation, Support Vector Machine
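  • The glottis opens to its widest position during forced breathing and closes to its narrowest position during phonation, so examination of the vocal folds and glottis relies on the frames captured at maximal opening and maximal closure as the basis for assessment. Hospitals still pick these still frames by hand from the video recorded with a strobe-laryngoscope, so a computer-aided system that identifies them automatically in the video is an important need.
    This study examines 220 video sequences recorded with a strobe-laryngoscope and 30 video sequences recorded with an added laser projection marking module. Color space conversion and image preprocessing are used to select the still frames of maximal glottal opening and maximal closure automatically and to quantify the glottal area. Because the recorded scene is never stationary, and tongue base movement, endoscope shake, and saliva produce unclear frames, the study sets a threshold from a gray-level adaptive entropy value and builds an elimination scheme that improves the automatic capture of the maximal-opening and maximal-closure frames, reaching an accuracy of up to 96%. In addition, several image preprocessing steps are used to compute the glottal area and the region segmentation threshold, segment the glottal region accurately, and plot the glottis area waveform (GAW) automatically as a supporting indicator of vocal fold health.
    For the intelligent vocal fold disorder recognition system, the study addresses four classes: normal vocal fold (Normal), vocal fold paralysis (Vocal Fold Paralysis), vocal fold polyp (Vocal Fold Polyp), and vocal fold cyst (Vocal Fold Cyst). Feature values are extracted for each class and a support vector machine (SVM) classifier is applied to identify vocal fold disorders, reaching a recognition accuracy of up to 98.75%. The results can serve as a reference for physicians diagnosing patients.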


    The human glottis opens to its widest extent during forced breathing and closes to its narrowest extent during phonation. When the glottis is examined by video laryngoscopy, the frames showing the widest opening and the narrowest closure are therefore captured as the basis for identifying possible vocal fold lesions. At present, doctors still select these still frames manually from the video clip recorded by the strobe-laryngoscope, so a computer-aided medical system that recognizes them automatically in the video is an important need.
    This study analyzed 220 video sequences recorded with a strobe-laryngoscope and 30 video sequences recorded with an additional laser marking module. The still frames of the glottis at its widest opening and narrowest closure were selected automatically by color space conversion and image preprocessing, and the glottal area was quantified. Because tongue base movement shifts the laryngoscope and saliva produces unclear images, this study sets a threshold from a gray-level adaptive entropy value and builds an elimination scheme that improves the automatic capture of the widest-opening and narrowest-closure frames. The accuracy reaches up to 96%. In addition, the glottal area and the region segmentation threshold were calculated with several image preprocessing steps, the glottal region was segmented accurately, and the glottis area waveform (GAW) was plotted automatically as an aid in assessing vocal fold health.
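    The elimination scheme and the frame selection described in this paragraph can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation. It assumes that frames are already grayscale 2-D lists of 0-255 values, that an unusable frame shows up as low histogram entropy, that the "adaptive" cutoff is a fraction of the clip's mean entropy, and that the glottis is the dark region below a fixed gray level. The function names and the parameters `entropy_ratio` and `dark_threshold` are hypothetical.

```python
import math


def gray_entropy(frame):
    """Shannon entropy (bits) of the gray-level histogram of a 2-D list."""
    counts = {}
    total = 0
    for row in frame:
        for value in row:
            counts[value] = counts.get(value, 0) + 1
            total += 1
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def glottal_area(frame, dark_threshold):
    """Pixel count of the dark region, used here as a stand-in for glottal area."""
    return sum(1 for row in frame for value in row if value < dark_threshold)


def select_extreme_frames(frames, entropy_ratio=0.8, dark_threshold=50):
    """Drop low-entropy frames, then pick the widest and narrowest glottis.

    Assumption: the cutoff adapts to the clip as a fraction of its mean entropy.
    Returns (index of widest frame, index of narrowest frame, {index: area}).
    """
    entropies = [gray_entropy(f) for f in frames]
    cutoff = entropy_ratio * sum(entropies) / len(entropies)
    kept = [i for i, e in enumerate(entropies) if e >= cutoff]
    areas = {i: glottal_area(frames[i], dark_threshold) for i in kept}
    widest = max(kept, key=areas.get)
    narrowest = min(kept, key=areas.get)
    return widest, narrowest, areas
```

    Read in frame order, the values in the returned `areas` dictionary form a crude glottal area waveform. Without the entropy cutoff, a featureless frame with no dark pixels would be mistaken for the narrowest closure, which is the failure the elimination scheme is meant to prevent.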
    For the intelligent recognition system for vocal fold disorders, this study addresses four classes: normal vocal fold, vocal fold paralysis, vocal fold polyp, and vocal fold cyst. Feature values are extracted for these four classes and a support vector machine (SVM) classifier is used to identify vocal fold disorders, with an identification accuracy of up to 98.75%. The results can assist doctors in diagnosing patients' conditions.
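    The four-class SVM step can be sketched as follows. This is a toy illustration rather than the classifier used in the thesis: it assumes a linear SVM trained by batch subgradient descent on the regularized hinge loss, combined one-vs-rest across the four classes. The abstract does not state the kernel, the multi-class strategy, or the features, so the two-dimensional feature vectors and all function names here are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=300):
    """Batch subgradient descent on the regularized hinge loss; y is +1 or -1."""
    dim = len(X[0])
    n = len(X)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(dim):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def train_one_vs_rest(X, labels):
    """Train one binary SVM per class (that class against all the others)."""
    models = {}
    for cls in sorted(set(labels)):
        y = [1 if label == cls else -1 for label in labels]
        models[cls] = train_linear_svm(X, y)
    return models


def predict(models, x):
    """Return the class whose SVM gives the highest decision value for x."""
    def score(cls):
        w, b = models[cls]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(models, key=score)
```

    A call such as `predict(train_one_vs_rest(X, labels), x)` returns one of the class labels supplied in `labels`. Real feature vectors would have more dimensions and would need scaling, and classes that are not linearly separable would need a kernel SVM, which this sketch does not cover.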

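    Chinese Abstract
    English Abstract
    Acknowledgments
    List of Figures and Tables
    Chapter 1 Introduction
      1.1 Research Motivation and Objectives
      1.2 Literature Review
      1.3 Thesis Structure and Research Flowchart
    Chapter 2 Basic Structure of the Human Vocal Folds and Common Disorders
      2.1 Basic Structure of the Human Vocal Folds
      2.2 Common Disorders and Diagnostic Methods
    Chapter 3 Overview of Digital Image Processing
      3.1 Basic Theory of Image Preprocessing
        3.1.1 Color Space Conversion
        3.1.2 Grayscale Conversion
        3.1.3 Contrast Enhancement
        3.1.4 Threshold Segmentation
        3.1.5 Region Filling
        3.1.6 Connected-Region Labeling
        3.1.7 Image Sharpness Detection
      3.2 Basic Theory of Image Similarity Matching
      3.3 Image Feature Extraction
        3.3.1 Statically Extracted Features
        3.3.2 Dynamically Extracted Features
        3.3.3 Scale Conversion Method
      3.4 Definition of the Image Preprocessing Techniques
    Chapter 4 Support Vector Machine Algorithm
      4.1 Linearly Separable
      4.2 Non-linearly Separable
    Chapter 5 Experimental Results and Discussion
      5.1 Introduction to the Experimental Equipment
        5.1.1 Hardware Architecture
        5.1.2 Software Development
      5.2 Experimental Analysis
        5.2.1 Experiment on the Algorithm for Finding Maximal-Opening and Maximal-Closure Frames in Video
        5.2.2 Experiment on the Blur-Resistant Algorithm for Finding Glottal Opening and Closure Frames in Video
        5.2.3 Scale-Quantified Glottal Area in the Algorithm for Finding Maximal-Opening and Maximal-Closure Frames
        5.2.4 Experiment on Automated GAW Plotting of Glottal Area from Video
        5.2.5 Automated Recognition of Vocal Fold Disorders with SVM
    Chapter 6 Conclusion
    References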

