
Author: Kuan-Ming Yen (閻冠銘)
Title: Multi-Strategies Integration Model for Domain Adaptation of Semantic Segmentation (基於多策略融合的域適應語義分割模型)
Advisor: Yung-Yao Chen (陳永耀)
Committee: Yung-Yao Chen (陳永耀), Chin-Hsien Wu (吳晋賢), Chung-An Shen (沈中安), Kai-Lung Hua (花凱龍)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: Chinese
Pages: 42
Keywords (Chinese): 語義分割, 域適應, 遷移學習, 一致性訓練
Keywords (English): Semantic Segmentation, Domain Adaptation, Transfer Learning, Consistency Training
Hits: 147; Downloads: 0

The combination of unsupervised domain adaptation and semantic segmentation helps solve many difficulties in daily life and at work. In street-scene images, for example, the technique automatically segments different objects so that a machine can classify and recognize them. In practice, however, several problems degrade the effectiveness of domain adaptation. This thesis therefore proposes an unsupervised domain-adaptive semantic segmentation model that combines multiple strategies, addressing class imbalance, the domain gap that hinders transfer learning, and the limitations of image-space consistency training, in order to improve the classifier's predictive ability and accuracy. The proposed method achieves strong scores on the GTA5-to-Cityscapes benchmark and, compared with other state-of-the-art works, has the advantage of fast convergence.


The combination of unsupervised domain adaptation and semantic segmentation has helped solve many practical difficulties. In urban-scene imagery, for example, this technology automatically segments different objects to facilitate machine classification and recognition. However, several problems can degrade the performance of transfer learning. We therefore propose an unsupervised domain adaptation model for semantic segmentation that integrates multiple strategies, addressing class imbalance, excessive domain gaps, and the limitations of image consistency training. Experiments show that our method is effective on the GTA5-to-Cityscapes benchmark; moreover, it attains performance comparable to existing state-of-the-art works while converging faster.
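The image consistency training mentioned in the abstract can be sketched generically: the model's prediction on a clean target image supplies per-pixel pseudo-labels, and the prediction on an augmented view of the same image is pushed toward them. This is a minimal illustration of the general technique (in the spirit of pixelwise consistency methods such as PixMatch), not the thesis's exact formulation; the function names and toy arrays are invented for illustration.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def consistency_loss(clean_logits, aug_logits):
    """Per-pixel cross-entropy between pseudo-labels taken from the
    clean view and predictions on the augmented view.

    Both inputs are (H, W, C) class logits for the same target image.
    """
    pseudo = softmax(clean_logits).argmax(axis=-1)   # (H, W) pseudo-labels
    probs = softmax(aug_logits)                      # (H, W, C) predictions
    h, w = pseudo.shape
    # gather the predicted probability of each pixel's pseudo-label
    p_true = probs[np.arange(h)[:, None], np.arange(w)[None, :], pseudo]
    return float(-np.log(p_true + 1e-8).mean())

# toy check: agreeing, confident views give a low loss;
# an uninformative augmented view gives a high one
logits = np.zeros((4, 4, 3))
logits[..., 1] = 5.0
low = consistency_loss(logits, logits)
high = consistency_loss(logits, np.zeros((4, 4, 3)))
assert low < high
```

In practice the clean-view branch is usually a frozen or momentum-averaged teacher, and the loss is backpropagated only through the augmented-view branch.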

Table of Contents
Abstract (Chinese)
Abstract (English)
Contents
Chapter 1: Introduction
  1.1 Preface
  1.2 Research Motivation
  1.3 Contributions
Chapter 2: Related Work
  2.1 Unsupervised Domain Adaptation for Semantic Segmentation
  2.2 Data Augmentation and Consistency Training
Chapter 3: Method
  3.1 Overall Concept and Architecture
  3.2 Class Sampling
  3.3 Style Fusion
  3.4 Consistency Training
  3.5 Loss Functions
Chapter 4: Experiments
  4.1 Datasets
  4.2 Environment and Evaluation
  4.3 Results
Chapter 5: Conclusion and Future Work
  5.1 Conclusion
  5.2 Future Work
References
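The class sampling strategy named above targets class imbalance: images containing rare classes are drawn more often than a uniform sampler would draw them. A common generic scheme weights each image by the rarity of the rarest class it contains; the sketch below illustrates that idea under assumed names and an assumed `temperature` parameter, and is not the thesis's exact procedure.

```python
import numpy as np

def class_frequencies(label_maps, num_classes):
    """Pixel frequency of each class over the labelled source set."""
    counts = np.zeros(num_classes)
    for lab in label_maps:
        counts += np.bincount(lab.ravel(), minlength=num_classes)
    return counts / counts.sum()

def sampling_weights(label_maps, num_classes, temperature=0.5):
    """Weight each image by the inverse frequency of the rarest class
    it contains; temperature < 1 softens the preference for rare classes.
    Returns a normalized probability vector over the images.
    """
    freq = class_frequencies(label_maps, num_classes)
    weights = []
    for lab in label_maps:
        present = np.unique(lab)               # classes appearing in this image
        rarest = freq[present].min()           # global frequency of the rarest one
        weights.append((1.0 / rarest) ** temperature)
    w = np.asarray(weights)
    return w / w.sum()

# toy set: the second image contains a rare class (2),
# so it receives most of the sampling probability
labs = [np.full((8, 8), 0), np.array([0] * 63 + [2]).reshape(8, 8)]
w = sampling_weights(labs, num_classes=3)
assert w[1] > w[0]
```

The resulting vector can be fed directly to a weighted sampler (e.g. `numpy.random.choice(len(labs), p=w)`) when building each training batch.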


Full-text release date: 2025/01/16 (campus network)
Full-text release date: 2025/01/16 (off-campus network)
Full-text release date: 2025/01/16 (National Central Library: Taiwan NDLTD system)