| Field | Value |
| --- | --- |
| Graduate Student | 林宇宏 Yu-Hung Lin |
| Thesis Title | 擴增設計結合重新關注機制轉換器於多模態磁振影像上的腦腫瘤切割方法 (mMRI Brain Tumor Segmentation with Extend 3D-UNet and Re-Attention Transformer) |
| Advisor | 郭景明 Jing-Ming Guo |
| Committee Members | 楊士萱 Shin-Hsuan Yang, 王乃堅 Nai-Jian Wang, 賴文能 Wen-Nung Lie, 楊家輝 Jar-Ferr Yang |
| Degree | Master |
| Department | Department of Electrical Engineering |
| Publication Year | 2022 |
| Academic Year | 110 |
| Language | Chinese |
| Pages | 73 |
| Chinese Keywords | brain tumor segmentation, transformer, re-attention mechanism, deep supervision |
| English Keywords | RSNA-ASNR-MICCAI BraTS 2021, Transformer, Brain Tumor Segmentation, Re-Attention Mechanism |
This thesis addresses multi-modal brain tumor segmentation by building on the Transformer, a popular architecture in current image recognition research, and U-Net, a common architecture for medical image segmentation. The U-Net structure is adjusted, and a re-attention mechanism is added to the Transformer, combined with deep supervision in the decoder, to preserve the most representative features and help the network converge better.
The dataset used in this thesis is BraTS 2021, whose training set contains 1251 samples and whose validation set contains 219 samples. The training set is further split 8:2 into training and validation portions. Because the official validation set has no public ground-truth segmentation labels, it is not used for model selection during training; instead it serves as the evaluation standard of the BraTS Challenge, acting like a test set scored on the Sage Bionetworks Synapse website.
For preprocessing, because the training samples are acquired by different scanners at different medical institutions and therefore have different intensity distributions, Z-score normalization is applied to standardize each sample.
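The Z-score normalization step above can be sketched as follows. This is a minimal illustration, assuming the statistics are estimated only over nonzero (brain) voxels of each skull-stripped volume, a common BraTS convention; the thesis does not spell out this detail.

```python
import numpy as np

def zscore_normalize(volume: np.ndarray) -> np.ndarray:
    """Standardize one MRI modality volume to zero mean and unit variance.

    Assumption (not stated in the thesis): statistics are computed over
    nonzero voxels only, so the zero background stays at zero.
    """
    mask = volume > 0
    brain = volume[mask]
    mean, std = brain.mean(), brain.std()
    out = volume.astype(np.float32).copy()
    out[mask] = (brain - mean) / max(std, 1e-8)  # guard against flat volumes
    return out
```

Each of the four mMRI modalities (T1, T1ce, T2, FLAIR) would be normalized independently this way before being stacked into the network input.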
The training samples contain two tumor grades, HGG (high-grade gliomas) and LGG (low-grade gliomas), classified by severity. Because LGG is less severe, some LGG samples contain no enhancing tumor at all, which causes extreme behavior at evaluation time: the Dice score is either 1 or 0. Two post-processing steps are therefore applied: first, connected components smaller than 10 voxels are removed; then, enhancing-tumor regions smaller than 250 voxels are relabeled as necrosis and non-enhancing tumor.
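The two post-processing steps above can be sketched as below. This is an illustrative sketch, not the thesis code: it assumes the usual BraTS label convention (1 = necrosis/non-enhancing tumor, 2 = edema, 4 = enhancing tumor), uses `scipy.ndimage.label` for connected components, and applies the 250-voxel threshold to the total enhancing-tumor volume, since the thesis does not state whether that threshold is per component or total.

```python
import numpy as np
from scipy import ndimage

def postprocess(seg: np.ndarray) -> np.ndarray:
    """Assumed BraTS labels: 1 = necrosis/non-enhancing, 2 = edema, 4 = ET."""
    out = seg.copy()

    # Step 1: remove connected components of the whole-tumor mask
    # that are smaller than 10 voxels.
    labeled, n_comp = ndimage.label(out > 0)
    for i in range(1, n_comp + 1):
        comp = labeled == i
        if comp.sum() < 10:
            out[comp] = 0

    # Step 2: if the enhancing tumor is smaller than 250 voxels,
    # relabel it as necrosis / non-enhancing tumor (label 1).
    et_voxels = (out == 4).sum()
    if 0 < et_voxels < 250:
        out[out == 4] = 1
    return out
```

This directly targets the 0-or-1 Dice extremes: predictions with a spuriously tiny enhancing-tumor region are converted to "no enhancing tumor", matching the LGG cases that truly lack one.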
Medical image segmentation is a relatively tricky task compared to natural image semantic segmentation, since medical imaging datasets are usually small owing to the difficulty of data acquisition. In this thesis, a U-Net-like architecture is proposed and applied to the RSNA-ASNR-MICCAI BraTS 2021 dataset, which contains 1251 training and 219 validation images. The model is evaluated through the online scoring of the validation dataset provided by Sage Bionetworks.
The proposed U-Net-like architecture incorporates a Transformer encoder to strengthen high-level feature extraction, with a re-attention mechanism replacing the standard self-attention in the Transformer. The Transformer's weakness on low-level features is compensated by the skip connections into the U-Net decoder. The additional extended layers and channels exhibit promising results in the experimental analysis. Moreover, deep supervision is employed to improve feature extraction during decoder learning.
On the Synapse online evaluation, the proposed architecture achieves Dice scores of 0.84438, 0.91832, and 0.86615, and Hausdorff95 distances of 16.21622, 4.47052, and 6.71641, on ET, WT, and TC, respectively — a competitive result that outperforms most similar approaches in the RSNA-ASNR-MICCAI BraTS 2021 Challenge.
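The Dice scores reported above are computed per tumor sub-region (ET, WT, TC), each treated as a binary mask. A minimal sketch of the metric, assuming the usual BraTS convention that two empty masks score 1 and exactly one empty mask scores 0:

```python
import numpy as np

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks.

    Assumed BraTS convention: both masks empty -> 1.0;
    exactly one empty -> 0.0 (the extreme 0-or-1 case that
    motivates the enhancing-tumor post-processing).
    """
    p, g = pred.astype(bool), gt.astype(bool)
    if not p.any() and not g.any():
        return 1.0
    inter = np.logical_and(p, g).sum()
    return 2.0 * float(inter) / float(p.sum() + g.sum())
```

The empty-mask branches make explicit why a case with no ground-truth enhancing tumor yields a score of exactly 1 or 0 depending on whether the prediction is also empty.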