
Author: 王泰淞 (Tai-Sung Wang)
Title: 應用對比學習之人體器官語意分割方法 (Semantic Segmentation on Human Organs by Contrastive Learning)
Advisors: 方文賢 (Wen-Hsien Fang), 陳郁堂 (Yie-Tarng Chen)
Committee members: 方文賢 (Wen-Hsien Fang), 陳郁堂 (Yie-Tarng Chen), 賴坤財 (Kuen-Tsair Lay), 阮聖彰 (Shanq-Jang Ruan), 丘建青 (Chien-Ching Chiu)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Publication year: 2022
Graduation academic year: 110 (ROC calendar)
Language: English
Pages: 56
Keywords (Chinese): 語意分割、醫療影像、對比學習、困難負樣本挖掘
Keywords (English): Semantic Segmentation, Medical Image, Contrastive Learning, Hard Negative Mining

Semantic segmentation of medical images has attracted attention because it can assist doctors in diagnosis. To improve prediction accuracy on object boundaries and difficult categories, this thesis considers a new architecture composed of two parts: TransUNet and a representation network. TransUNet is a U-shaped network that pairs a symmetric encoder-decoder with skip connections to better preserve low-level features. However, rare objects are often mispredicted and their contours are blurry. To reduce these problems, we apply contrastive learning through the representation network, helping the model learn and predict blurry contour pixels and rare objects. In addition, we apply hard negative mining to rare objects to reduce the effect of class imbalance: our strategy adds a weight that increases the contribution of rare categories to the contrastive objective. Our experiments show that the method achieves excellent accuracy on two commonly used medical datasets.


Semantic segmentation of medical images has attracted much attention recently due to its capability to assist doctors in making a diagnosis. To increase the prediction accuracy on object boundaries and hard categories, in this thesis we consider a new architecture composed of two parts: TransUNet and a representation network. TransUNet is a U-shaped network architecture that combines a symmetric encoder-decoder with skip connections to better retain low-level details. However, small objects are often mislabeled and their contours are blurry. To lessen these problems, we employ contrastive learning through a representation network to help the model learn and predict blurry contour pixels and small objects. Furthermore, we apply hard negative mining to the categories composed of small objects to circumvent the class imbalance problem. Our mining strategy adds a weighting that increases the contribution of those small-object categories to the contrastive loss. Our experiments show that the method achieves superior performance compared with previous works on two commonly used medical datasets.
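The mining strategy described above — adding a per-class weight so that rare (small-object) classes contribute more to the contrastive objective — can be sketched roughly as follows. This is a minimal NumPy illustration of a class-weighted InfoNCE-style loss, not the thesis's actual ReCo implementation; the function name, tensor shapes, temperature, and the single-positive-key-per-anchor setup are all illustrative assumptions.

```python
import numpy as np

def weighted_contrastive_loss(queries, positives, negatives, labels,
                              class_weights, tau=0.5):
    """InfoNCE-style contrastive loss with a per-class weight on each anchor.

    queries:       (N, D) anchor pixel embeddings
    positives:     (N, D) one positive key per anchor (e.g. its class centroid)
    negatives:     (N, K, D) K negative keys per anchor
    labels:        (N,) integer class id of each anchor pixel
    class_weights: (C,) weight per class; rare classes get larger values
    """
    def l2norm(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    q, p, n = l2norm(queries), l2norm(positives), l2norm(negatives)
    pos_logit = np.sum(q * p, axis=-1) / tau            # (N,) similarity to positive
    neg_logits = np.einsum('nd,nkd->nk', q, n) / tau    # (N, K) similarity to negatives
    logits = np.concatenate([pos_logit[:, None], neg_logits], axis=1)

    # Numerically stable log-softmax; the positive key sits at column 0.
    m = logits.max(axis=1, keepdims=True)
    log_prob = pos_logit - (m[:, 0] + np.log(np.exp(logits - m).sum(axis=1)))

    # Up-weight anchors from rare classes, then normalize by the total weight.
    w = class_weights[labels]
    return float(-(w * log_prob).sum() / w.sum())
```

Because the weighted sum is normalized by the total weight, scaling all weights uniformly leaves the loss unchanged; only raising the weight of a rare class relative to the others increases that class's share of the objective.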

Abstract (in Chinese)
Abstract (in English)
Acknowledgment
Table of Contents
List of Figures
List of Tables
List of Acronyms
1 Introduction
2 Related Work
  2.1 Semantic Segmentation
  2.2 Data Augmentation
  2.3 Transformer
  2.4 Contrastive Learning
  2.5 Summary
3 Proposed Method
  3.1 Proposed Architecture
  3.2 TransUNet
    3.2.1 Encoder Part
    3.2.2 Decoder Part
  3.3 Representation Network
  3.4 Loss Function
    3.4.1 Cross Entropy Loss
    3.4.2 Dice Loss
    3.4.3 ReCo Loss
  3.5 Hard Negative Mining
  3.6 Summary
4 Experimental Results and Discussion
  4.1 Medical Computed Tomography Dataset
    4.1.1 Synapse Multi-organ Segmentation Dataset
    4.1.2 Automated Cardiac Diagnosis Challenge
  4.2 Experimental Setup
    4.2.1 Data Augmentation
    4.2.2 Evaluation Metrics
  4.3 Experimental Results
    4.3.1 Synapse Multi-organ Segmentation Dataset
    4.3.2 Automated Cardiac Diagnosis Challenge
  4.4 Failure Case and Error Analysis
  4.5 Summary
5 Conclusion and Future Works
  5.1 Conclusion
  5.2 Future Works
References


Full text available from 2024/08/23 (campus network)
Full text available from 2024/08/23 (off-campus network)
Full text available from 2024/08/23 (National Central Library: Taiwan NDLTD system)