Graduate Student: 陳冠瑜 Kuan-Yu Chen
Thesis Title: 基於深度學習技術實現自動追蹤耳膜與發炎區域之中耳炎輔助診斷系統 (Design and Implementation of a Deep Learning Based Automatic Tracking Function for the Eardrum and Otitis Media in a Computer-Assisted Diagnosis System)
Advisor: 郭中豐 Chung-Feng Kuo
Committee Members: 郭中豐 Chung-Feng Kuo, 黃昌群 Chang-Cyun Huang, 劉紹正 Shao-Jheng Liou, 邱錦勳 Jin-Syun Ciou
Degree: Master
Department: College of Engineering, Department of Materials Science and Engineering
Year of Publication: 2022
Academic Year of Graduation: 110
Language: Chinese
Pages: 130
Keywords: otitis media (tympanitis), acute otitis media, deep learning, convolutional neural network, semantic segmentation, disease detection
Otitis media (tympanitis) is a common upper respiratory tract inflammation caused by bacterial infection of the middle ear. It most often affects preschool children under eight years of age, although it occurs across all age groups, and it can be divided into acute otitis media, otitis media with effusion, and chronic otitis media. During a bacterial infection, pressure in the middle ear cavity rises sharply, causing pain and sometimes severe suppurative swelling; if persistent inflammation and discharge are left untreated, the auditory nerve can be permanently damaged and a ruptured eardrum may fail to heal. This study uses deep learning convolutional neural networks and image processing techniques to build a classification system and thereby implement a fully automatic computer-assisted diagnosis system for otitis media. The system predicts middle ear inflammation, classifies otitis media diseases, and rapidly locates five anatomical regions and the inflammation types in otoscope images, assisting physicians in clinical diagnosis.
This study was divided into three parts. (1) Semantic segmentation with a deep convolutional neural network: otoscope data from 400 patients of Tri-Service General Hospital were annotated in collaboration with physicians and used to train a fully automatic model that segments and recognizes the middle ear regions in an image; image feature parameters were then extracted and the unlabeled data were used for verification. The model reached a Dice score of 81.68%, and prediction took only 15 seconds on average for each of 100 groups of patient data. In addition, three otorhinolaryngologists from Tri-Service General Hospital cross-validated the model on a test set of otoscope images from 50 different patients. The regions compared were the external auditory canal, eardrum, pars flaccida, handle of malleus, and tympanic ring; the average accuracy reached 87.01%, showing that the semantic segmentation model can accurately segment and identify the contours of the middle ear regions, with misclassification occurring only in a small number of boundary areas of the handle of malleus. (2) An artificial neural network architecture was built in which the selected features and their interleaved information are passed through weight updates to a classifier for recognition training, in order to analyze the disease attributes of the eardrum and the other regions; according to the confusion matrix, the average classification accuracy reached 92.65%. (3) A classification system was created so that a new sample can be assigned to one of the five classes. In the order of the semantically segmented regions, the accuracy was 99.92% for the external auditory canal, 98.51% for the eardrum, 95.41% for the pars flaccida, 95.41% for the tympanic ring, and 97.55% for the handle of malleus, thereby realizing the computer-assisted diagnosis system for otitis media.
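The study evaluates its segmentation model with the Dice score (81.68%). As a minimal NumPy sketch of how the Dice coefficient is typically computed between a predicted and a ground-truth binary mask (toy 4x4 arrays and the function name `dice_coefficient` are illustrative assumptions, not the author's code):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return 2.0 * intersection / (pred.sum() + target.sum() + eps)

# Toy masks: the prediction recovers 3 of the 4 ground-truth pixels.
gt = np.array([[0, 0, 0, 0],
               [0, 1, 1, 0],
               [0, 1, 1, 0],
               [0, 0, 0, 0]])
pr = np.array([[0, 0, 0, 0],
               [0, 1, 1, 0],
               [0, 1, 0, 0],
               [0, 0, 0, 0]])
print(round(dice_coefficient(pr, gt), 4))  # 2*3 / (3+4) ≈ 0.8571
```

In a multi-class setting such as the five ear regions here, Dice is usually computed per class on one-hot masks and then averaged.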
Keywords: tympanitis; acute otitis media; deep learning; convolutional neural network; semantic segmentation; disease detection
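The classification results in part (2) are reported from a confusion matrix (92.65% average accuracy) and part (3) reports per-region accuracies. A short sketch of how overall and per-class accuracy are derived from a confusion matrix (the 5x5 counts below are invented for illustration, with classes in the thesis's order: external auditory canal, eardrum, pars flaccida, tympanic ring, handle of malleus; they are not the thesis's numbers):

```python
import numpy as np

# Hypothetical confusion matrix: rows = true class, columns = predicted class.
cm = np.array([
    [97,  1,  1,  1,  0],
    [ 2, 95,  1,  1,  1],
    [ 1,  2, 94,  2,  1],
    [ 0,  1,  2, 96,  1],
    [ 1,  1,  1,  2, 95],
])

per_class_acc = cm.diagonal() / cm.sum(axis=1)  # recall of each region
overall_acc = cm.trace() / cm.sum()             # fraction of all samples predicted correctly
print(per_class_acc)
print(round(overall_acc, 4))
```

The diagonal holds correct predictions; off-diagonal entries show which regions are confused with which, which is how boundary misclassifications (e.g. around the handle of malleus) become visible.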