Graduate Student: 蕭叡謙 Jui-Chien Hsiao
Thesis Title: mMRI Brain Tumor Segmentation with Dual Attention UNet and Transformer Deep Feature Extractor (雙注意力機制設計結合轉換器之深層特徵於磁振影像上的腦腫瘤切割方法)
Advisor: 郭景明 Jing-Ming Guo
Committee Members: 王乃堅 Nai-Jian Wang, 陳彥霖 Yen-Lin Chen, 李宗南 Chung-Nan Lee, 徐繼聖 Gee-Sern Jison Hsu
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of Publication: 2021
Academic Year: 109
Language: Chinese
Pages: 83
Chinese Keywords: Brain Tumor Segmentation, Self-Attention Mechanism, Deep Supervision
English Keywords: Brain Tumor Segmentation, Transformer, Attention Mechanism
This thesis proposes an architecture for multimodal brain tumor segmentation built on the currently popular Transformer and on U-Net, a common backbone for medical imaging. A dual attention mechanism is added on the skip connections, and deep supervision is added in the decoder, to ensure that the features recovered during decoding are as representative as possible and to help the network converge better.
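The dual attention on the skip connections can be sketched as two separate gating branches, one over channels and one over spatial locations, whose outputs are merged before the decoder. The sketch below is a simplified 2D NumPy illustration under stated assumptions: the sigmoid gating and the summation merge are not specified in the abstract, and the function names are hypothetical.

```python
import numpy as np

def _sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat):
    """Gate each channel by a sigmoid of its global average activation
    (squeeze-and-excitation style).  feat has shape (C, H, W)."""
    gate = _sigmoid(feat.mean(axis=(1, 2)))      # one gate per channel, (C,)
    return feat * gate[:, None, None]

def spatial_attention(feat):
    """Gate each spatial location by a sigmoid of its cross-channel mean."""
    gate = _sigmoid(feat.mean(axis=0))           # one gate per pixel, (H, W)
    return feat * gate[None, :, :]

def dual_attention_skip(feat):
    """Run the two branches separately and merge them (here by summation)."""
    return channel_attention(feat) + spatial_attention(feat)
```

Keeping the two branches separate lets the channel gate select which encoder feature maps to pass through while the spatial gate suppresses non-tumor regions, before the gated skip feature is concatenated into the decoder.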
The dataset used in this thesis is BraTS 2020, whose training set consists of 369 samples and whose validation set consists of 125 samples. No segmentation annotations are officially provided for the validation set, so it cannot be used for model selection; instead, segmentation results must be uploaded to the CBICA IPP website for scoring.
For preprocessing, since the training samples were acquired on multiple different scanners, their pixel-value distributions follow different baselines. Each sample is therefore standardized with Z-score normalization, and the extreme minimum and maximum values are clipped to remove noise.
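A minimal sketch of this preprocessing step, assuming percentile-based clipping (the exact clipping rule and the 1% threshold are assumptions, not from the thesis) and normalizing only over the nonzero brain region, as is common for BraTS volumes:

```python
import numpy as np

def preprocess(volume, clip_percentile=1.0):
    """Clip extreme values, then Z-score normalize over nonzero (brain) voxels.

    Background voxels (value 0) are excluded from the statistics and kept
    at 0, so scanner-dependent baselines are removed per sample."""
    brain = volume > 0
    lo, hi = np.percentile(volume[brain], [clip_percentile, 100 - clip_percentile])
    clipped = np.clip(volume, lo, hi)            # suppress extreme noise values
    mean = clipped[brain].mean()
    std = clipped[brain].std()
    out = np.zeros_like(volume, dtype=np.float64)
    out[brain] = (clipped[brain] - mean) / (std + 1e-8)
    return out
```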
For post-processing, the training samples are divided by tumor severity into HGG (high-grade gliomas) and LGG (low-grade gliomas), whose tumors present somewhat differently; most notably, some LGG cases contain no enhancing tumor at all. This causes an extreme scoring phenomenon in which the Dice score is either 1 or 0, so post-processing is necessary on this dataset. It includes removing connected components smaller than 10 voxels and relabeling enhancing-tumor regions smaller than 500 voxels as necrosis or non-enhancing tumor.
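The two post-processing rules above might be implemented as follows. The 10- and 500-voxel thresholds come from the abstract; the 6-connectivity, the pure-Python component labeling, and the choice of relabeling enhancing tumor to label 1 (the standard BraTS convention uses 1 = necrosis/non-enhancing, 2 = edema, 4 = enhancing tumor) are assumptions for illustration.

```python
import numpy as np
from collections import deque

def connected_components(mask):
    """Label 6-connected foreground components of a 3D boolean mask."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for idx in zip(*np.nonzero(mask)):
        if labels[idx]:
            continue
        count += 1
        labels[idx] = count
        queue = deque([idx])
        while queue:                              # breadth-first flood fill
            x, y, z = queue.popleft()
            for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                               (0, -1, 0), (0, 0, 1), (0, 0, -1)):
                n = (x + dx, y + dy, z + dz)
                if (all(0 <= n[i] < mask.shape[i] for i in range(3))
                        and mask[n] and not labels[n]):
                    labels[n] = count
                    queue.append(n)
    return labels, count

def postprocess(seg, min_cc=10, min_et=500):
    """Drop tiny connected components, then relabel a too-small
    enhancing-tumor class as necrosis to avoid the all-or-nothing Dice."""
    out = seg.copy()
    labels, count = connected_components(out > 0)
    for i in range(1, count + 1):
        if (labels == i).sum() < min_cc:
            out[labels == i] = 0                  # remove small components
    if 0 < (out == 4).sum() < min_et:
        out[out == 4] = 1                         # ET -> necrosis/non-enhancing
    return out
```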
Medical image segmentation is a relatively tricky task compared to natural image semantic segmentation: medical image datasets are usually small because data acquisition is difficult. In this thesis, a UNet-like architecture is proposed and evaluated on the MICCAI BraTS 2020 dataset, which contains 369 training samples and 125 validation samples. Model performance is evaluated through the online scoring of the validation set provided by CBICA IPP.
The proposed UNet-like architecture incorporates a Transformer encoder to strengthen high-level feature extraction. The Transformer's weakness on low-level features is compensated by the UNet decoder skip connections, on which we design an attention mechanism with separate spatial and channel branches. We also apply deep supervision to the decoder so that its learning process yields better features.
The proposed UNet-like architecture achieves Dice scores of 0.78345, 0.90402, and 0.83887 on ET, WT, and TC, respectively, under the CBICA IPP online evaluation, a competitive result that surpasses most similar approaches in the MICCAI BraTS 2020 Challenge.
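The three evaluated regions are derived from the label map rather than predicted directly, and the Dice metric is conventionally defined as 1 when both prediction and ground truth are empty, which is what makes missing enhancing tumor on LGG cases an all-or-nothing score. A sketch of that evaluation, assuming the standard BraTS label convention (1 = necrosis/non-enhancing, 2 = edema, 4 = enhancing tumor):

```python
import numpy as np

def dice(pred, gt):
    """Dice coefficient between two binary masks; defined as 1.0 when
    both masks are empty (the convention behind the LGG scoring issue)."""
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 1.0 if total == 0 else 2.0 * inter / total

def brats_regions(seg):
    """Map a BraTS label map (labels 1, 2, 4) to the three scored regions."""
    return {
        "ET": seg == 4,               # enhancing tumor
        "TC": np.isin(seg, (1, 4)),   # tumor core = necrosis + enhancing
        "WT": seg > 0,                # whole tumor = all tumor labels
    }
```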