
Graduate Student: 陳冠瑜 (KUAN-YU CHEN)
Thesis Title: 基於深度學習技術實現自動追蹤耳膜與發炎區域之中耳炎輔助診斷系統
Design and Implementation of Deep Learning Based Automatic Tracking Function in Eardrum and Otitis Media Technology for Computer Assisted Diagnosis System
Advisor: 郭中豐 (Chung-Feng Kuo)
Committee: 郭中豐 (JHONG-FONG KUO), 黃昌群 (CHANG-CYUN HUANG), 劉紹正 (LIOU, SHAO-JHENG), 邱錦勳 (JIN-SYUN CIOU)
Degree: Master
Department: College of Engineering - Department of Materials Science and Engineering
Year of Publication: 2022
Graduation Academic Year: 110
Language: Chinese
Pages: 130
Chinese Keywords: 中耳炎、急性中耳炎、深度學習、卷積神經網路、語義分割、疾病檢測
Foreign Keywords: tympanitis, acute otitis media, deep learning, convolutional neural network, semantic segmentation, disease detection
Access count: 178 views, 0 downloads
    Otitis media is a common upper respiratory tract inflammation caused by bacterial infection of the middle ear. It occurs most often in preschool children under 8 years old, though it can affect all age groups, and can be divided into acute otitis media, otitis media with effusion, and chronic otitis media. During a bacterial infection, pressure in the middle ear cavity rises sharply and causes pain, sometimes leading to severe suppuration and swelling; if persistent inflammation and discharge are left untreated, the auditory nerve can be permanently damaged and the ruptured eardrum may fail to heal. This study uses deep learning convolutional neural networks and image processing techniques to build a classification system and thereby implement a fully automatic computer-assisted diagnosis system for otitis media: it diagnoses and predicts middle ear inflammation, classifies otitis media diseases, and rapidly locates five anatomical regions and the inflammation type in otoscope images to support physicians' clinical diagnosis.
    The study comprises three parts. (1) Semantic segmentation with a deep learning convolutional neural network: otoscope images from 400 patients at Tri-Service General Hospital were annotated jointly with physicians, and a fully automatic model was trained to segment and recognize middle ear regions and extract image feature parameters. Validated on unlabeled data, the model reached a Dice score of 81.68%, and predicting 100 sets of patient data took only 15 seconds per set on average. In addition, three otorhinolaryngologists from Tri-Service General Hospital cross-validated a test set of otoscope images from 50 different patients, comparing the external auditory canal, eardrum, pars flaccida, handle of malleus, and tympanic ring; the average accuracy reached 87.01%, showing that the semantic segmentation model precisely segments and recognizes middle ear contours, with misjudgments only in a few boundary regions of the handle of malleus. (2) An artificial neural network was built in which selected features and interleaved information are passed through weight updates to a classifier for recognition training, so as to analyze the disease attributes of the eardrum and other regions; the average classification accuracy according to the confusion matrix was 92.65%. (3) A classification system was built so that each new data sample can be assigned to one of five classes; in the order of the semantically segmented regions, the accuracy was 99.92% for the external auditory canal, 98.51% for the eardrum, 95.41% for the pars flaccida, 95.41% for the tympanic ring, and 97.55% for the handle of malleus, realizing the computer-assisted otitis media diagnosis system.
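The 92.65% classification figure reported above is the overall accuracy read off a confusion matrix. A minimal sketch of that computation (illustrative code with made-up counts, not the thesis implementation):

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy = sum of the diagonal / sum of all entries.
    Rows are true classes, columns are predicted classes."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Made-up 3-class counts, purely illustrative.
cm = [[50,  2,  1],
      [ 3, 45,  2],
      [ 1,  2, 44]]
print(round(accuracy_from_confusion(cm), 4))  # → 0.9267
```

Per-class accuracies, like those quoted for the five anatomical regions, would instead divide each diagonal entry by its row sum.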


    Tympanitis (otitis media) is a prevalent upper respiratory tract inflammation induced by bacterial infection of the middle ear. It occurs most often in preschool children under 8 years old, though it can affect all age groups, and can be divided into acute otitis media, otitis media with effusion, and chronic otitis media. During a bacterial infection, pressure in the middle ear cavity rises rapidly, causing pain and sometimes a serious abscess. If persistent inflammation and suppuration are not treated, the auditory nerve will be permanently damaged and the ruptured eardrum cannot heal. This study uses deep learning convolutional neural networks and image processing techniques to create a classification system and thereby implement a fully automatic auxiliary diagnosis system for tympanitis, which diagnoses and predicts middle ear inflammation and classifies tympanitis diseases. The system also rapidly locates five regions and the inflammation type in otoscope images to assist doctors in clinical diagnosis.
    This study was divided into three parts. (1) Semantic segmentation with a deep learning convolutional neural network: otoscope data from 400 patients at Tri-Service General Hospital were labeled in collaboration with doctors, and a fully automatic program was trained to segment and recognize the middle ear regions in an image. Image feature parameters were then extracted, and unlabeled data were used for verification. The model achieved a Dice score of 81.68%, and prediction for each of 100 groups of patient data took only 15 seconds on average. Additionally, three otorhinolaryngologists from Tri-Service General Hospital were invited for cross-validation on a test set of otoscope images from 50 patients. The positions compared were the external auditory canal, eardrum, pars flaccida, handle of malleus, and tympanic ring; the average accuracy was as high as 87.01%, meaning the semantic segmentation model could accurately segment and identify the contour of the middle ear region, with only a small amount of misrecognition at the boundary of the handle of malleus. (2) An artificial neural network architecture was created: the selected features and interleaved information were passed to a classifier for recognition training through weight updates, so as to analyze the disease attributes of the eardrum and other regions. The average classification accuracy was 92.65% according to the confusion matrix. (3) A classification system was created so that a new piece of data can be assigned to one of the five classes. In the order of the semantically segmented regions, the accuracy was 99.92% for the external auditory canal, 98.51% for the eardrum, 95.41% for the pars flaccida, 95.41% for the tympanic ring, and 97.55% for the handle of malleus. Hence, the goal of building an auxiliary diagnosis system for tympanitis has been accomplished.
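The 81.68% Dice score above measures overlap between a predicted segmentation mask and the physician-labeled ground truth. A minimal sketch of that metric for flattened binary masks (illustrative code with toy masks, not the thesis implementation):

```python
def dice_coefficient(pred, truth):
    """Dice = 2|P ∩ T| / (|P| + |T|) for binary masks given as flat 0/1 lists."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    if total == 0:
        return 1.0  # both masks empty: count as perfect agreement
    return 2.0 * intersection / total

# Toy 6-pixel masks: 2 overlapping foreground pixels, 3 foreground in each mask.
pred  = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
print(round(dice_coefficient(pred, truth), 4))  # → 0.6667
```

In practice the masks would be the per-region outputs of the segmentation network, one binary mask per anatomical class (external auditory canal, eardrum, pars flaccida, handle of malleus, tympanic ring).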

    Keywords: tympanitis; acute otitis media; deep learning; convolutional neural network; semantic segmentation; disease detection

    Table of Contents
    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Literature Review
        1.2.1 Clinical Otitis Media Assessment Methods
        1.2.2 Deep Learning Applications in Medicine
        1.2.3 Middle Ear Region Segmentation
        1.2.4 Degree of Otitis Media Inflammation
      1.3 Research Objectives
      1.4 Thesis Structure
    Chapter 2 Relevant Medical Background
      2.1 Structure and Function of the Ear
      2.2 Middle Ear Infection and Otitis Media
      2.3 Assessing Otitis Media
        2.3.1 Equipment Requirements and Experimental Environment
        2.3.2 Methods for Assessing Otitis Media
        2.3.3 Symptoms of Otitis Media
    Chapter 3 Research Methods and Theory
      3.1 Image Preprocessing
        3.1.1 Data Augmentation
        3.1.2 Gaussian Blur
        3.1.3 Noise Injection
        3.1.4 Geometric Transformations
      3.2 Color Space Conversion
        3.2.1 RGB Color Space
        3.2.2 HSV Color Space
      3.3 Deep Learning
        3.3.1 Machine Learning and Deep Learning
        3.3.2 Loss Functions
      3.4 Convolutional Neural Networks (CNN)
        3.4.1 Convolutional Layers
        3.4.2 Activation Functions
        3.4.3 Pooling Layers
        3.4.4 Normalization
        3.4.5 Fully Connected Layers
        3.4.6 Optimizers
      3.5 Cross-Validation
      3.6 Image Features
        3.6.1 Edge Detection
        3.6.2 Feature Point Selection
        3.6.3 Hue and Texture
        3.6.4 Gray-Level Co-occurrence Matrix
        3.6.5 Feature Extraction and Feature Value Computation
    Chapter 4 Experiments and Validation
      4.1 Experimental Equipment and Sample Environment
      4.2 Data Collection and Preprocessing
        4.2.1 Dataset Partitioning
        4.2.2 Image Preprocessing for Data Augmentation
      4.3 Image Data Augmentation
        4.3.1 Gaussian Blur
        4.3.2 Noise Injection
        4.3.3 Cropping
      4.4 Middle Ear Region Contour Segmentation
        4.4.1 Loss Function Selection
        4.4.2 Semantic Segmentation Model Training
        4.4.3 Performance Evaluation Metrics
      4.5 Otitis Media Disease Classifier
        4.5.1 Hue and Texture Analysis
        4.5.2 Classifier Algorithm Selection
        4.5.3 Classifier Model Training
        4.5.4 Automatic Classification of Otitis Media Symptoms
    Chapter 5 Experimental Results and Analysis
      5.1 Middle Ear Segmentation Performance Analysis
      5.2 Otitis Media Disease Classification Performance Analysis
      5.3 Discussion of Results
    Chapter 6 Conclusions
    References


    Full text release date: 2024/09/29 (campus network)
    Full text release date: 2024/09/29 (off-campus network)
    Full text release date: 2024/09/29 (National Central Library: Taiwan Doctoral and Master's Theses System)