
Graduate Student: Che-Kuei Liang (梁哲魁)
Thesis Title: Labeling Propagation with Conservative Diffusion in Feature Space for Unsupervised Domain Adaptation
Advisor: Hsing-Kuo Pao (鮑興國)
Committee Members: Tyng-Luh Liu (劉庭祿), Tien-Ruey Hsiang (項天瑞), Kuan-Yu Chen (陳冠宇)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of Publication: 2020
Academic Year of Graduation: 109
Language: English
Number of Pages: 60
Keywords: transfer learning, domain adaptation, label propagation, diffusion, deep learning, curriculum learning, pseudo-label



We propose a method for unsupervised domain adaptation based on feature-space diffusion with a conservative labeling strategy. Domain adaptation has drawn attention in many applications in recent years. The success of deep learning leads us to ask for an extension from effective modeling in the source domain to effective modeling in the target domain. As numerous researchers have pointed out, it is not guaranteed that the success in the source domain can be replicated in the target domain. The problem is even more challenging when we have little information about the target domain. We attempt to solve the problem by conservative diffusion in the feature space.

Diffusion-based pseudo-labeling in the feature space is considered more appropriate than pseudo-labeling in the input space for enhancing the labeling quality on unlabeled data. Under the manifold assumption on the input data, we expect the feature space obtained from the deep learning procedure to represent the relationships between samples well. To make the pseudo-labels on the unlabeled data as accurate as possible, we adopt a conservative strategy in the diffusion procedure: we label an unlabeled sample only when we have enough confidence in doing so. After we have acquired more labeled data for training (with some pseudo-label information), another round of labeling can be performed, and a more robust model can be expected from such a procedure.
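The conservative diffusion step described above can be sketched as follows. This is a minimal illustration rather than the thesis's actual implementation: it builds a cosine-similarity kNN graph over deep features, runs the closed-form label propagation of Zhou et al.'s learning-with-local-and-global-consistency formulation, F = (I - αS)^{-1} Y, and then accepts a pseudo-label only when its normalized score clears a confidence threshold. All names (`conservative_diffusion_labels`, `alpha`, `k`, `conf_threshold`) are illustrative assumptions, not identifiers from the thesis.

```python
import numpy as np

def conservative_diffusion_labels(features, labels, alpha=0.99, k=10, conf_threshold=0.9):
    """Diffusion-based pseudo-labeling with a conservative confidence cutoff.

    features: (n, d) array of deep features for labeled + unlabeled samples.
    labels:   (n,) int array; -1 marks unlabeled samples.
    Returns pseudo-labels; entries below the confidence threshold stay -1.
    """
    n = features.shape[0]
    n_classes = int(labels.max()) + 1

    # Cosine-similarity kNN affinity graph over the feature space.
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, 0.0)
    W = np.zeros_like(sim)
    for i in range(n):
        nn = np.argsort(sim[i])[-k:]       # k most similar neighbors
        W[i, nn] = sim[i, nn]
    W = np.maximum(W, W.T)                 # symmetrize the graph

    # Symmetrically normalized smoothing operator S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # One-hot seed matrix Y for the labeled points.
    Y = np.zeros((n, n_classes))
    labeled = labels >= 0
    Y[labeled, labels[labeled]] = 1.0

    # Closed-form diffusion F = (I - alpha*S)^{-1} Y (Zhou et al., 2004).
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)

    # Conservative step: accept a pseudo-label only with high confidence.
    probs = F / np.maximum(F.sum(axis=1, keepdims=True), 1e-12)
    pseudo = labels.copy()
    confident = (~labeled) & (probs.max(axis=1) >= conf_threshold)
    pseudo[confident] = probs[confident].argmax(axis=1)
    return pseudo
```

In the curriculum-style loop the abstract describes, the confidently pseudo-labeled samples would be folded into the training set, the feature extractor retrained, and the diffusion repeated on the improved features.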

Experiments on unsupervised domain adaptation show that the proposed method is indeed more effective than many state-of-the-art approaches on several domain adaptation benchmarks.

1 Introduction
  1.1 Our contribution
  1.2 Thesis outline
2 Related Work
  2.1 Domain Adaptation
  2.2 Semi-Supervised Learning in UDA
  2.3 Curriculum Learning
3 Methodology
  3.1 Problem Definition
  3.2 Overview of Proposed Method
  3.3 Diffusion-based Pseudo-Labeling
  3.4 Regularize with Mixup-Feature
  3.5 Easy Samples Selection and Adaptation
  3.6 Curriculum-based Iterative Training
4 Experiment
  4.1 Datasets
  4.2 Implementation Details
  4.3 Comparison with State-Of-The-Art
  4.4 Further Empirical Analysis
    4.4.1 Pilot Study
    4.4.2 Ablation Study
    4.4.3 Label Propagation Parameters Analysis
    4.4.4 Pseudo-Labeling Methods Comparison
    4.4.5 Curriculum Schedule Comparison
    4.4.6 Label Propagation Versus Easy Target Adaptation
    4.4.7 Feature Visualization
    4.4.8 Semi-Supervised Methods Comparison
5 Conclusions


Full text available from 2023/12/15 (campus network), 2025/12/15 (off-campus network), and 2025/12/15 (National Central Library: Taiwan NDLTD system).