
Graduate Student: 洪昱翔 (Yu-Hsiang Hung)
Thesis Title: Unpaired Image-to-Image Translation using Negative Learning for Noisy Patches (針對雜訊運用反向學習非監督式圖像轉換)
Advisor: 花凱龍 (Kai-Lung Hua)
Committee Members: 陳駿丞 (Jun-Cheng Chen), 鍾國亮 (Kuo-Liang Chung), 陳祝嵩 (Chu-Song Chen), 陳俊仰 (Scott Chen)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2022
Academic Year of Graduation: 110
Language: English
Number of Pages: 34
Keywords: image generation, negative learning, generative adversarial networks (影像生成, 反向學習, 生成對抗網路)

The goal of unpaired image-to-image translation is to find correspondences between two unpaired sets of data samples. One approach achieves one-sided translation through patchwise contrastive learning: it maximizes the mutual information between corresponding patches of the input and output images, and treats the remaining, non-corresponding patches from the input image as negative samples. Because these samples are selected at random, patches that are semantically similar to the positive sample, such as the paired eyebrows, eyes, and cheeks of a face, can be wrongly treated as negatives during contrastive learning. To address this problem, we propose patchwise negative learning: in addition to maximizing the mutual information of the positive sample and minimizing that of the negatives, we select one further non-corresponding negative sample and maximize its dissimilarity. In this way we reduce the chance of choosing wrong negatives and ensure that the model does not fit to the negative samples. Experimental results show that our method outperforms other popular models.


The goal of unpaired image-to-image translation is to find a mapping between two domains that lack paired data samples. One approach is patchwise contrastive learning, a one-sided translation method that maximizes the mutual information between corresponding input and output patches; non-corresponding patches are referred to as negatives. Previous approaches select these patches randomly, so semantically similar patches, such as eyebrows, eyes, and cheeks, are wrongly labelled as negatives during contrastive learning. To address this issue, we propose the patchwise negative learning loss, a novel loss based on negative learning. Unlike prior methods, which only maximize mutual information between corresponding patches and naively minimize it between all non-corresponding ones, we additionally choose one non-corresponding negative patch and maximize its dissimilarity with the query patch. In doing so, we reduce the chance of selecting false negatives that contain high mutual information, and by further maximizing dissimilarity with a single negative, we discourage our model from fitting to those noisy negative patches. Our experiments show that our method outperforms state-of-the-art methods.
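To make the idea concrete, the following is a minimal PyTorch-style sketch of a patchwise contrastive (InfoNCE) objective extended with a negative-learning term. The function name, feature shapes, temperature value, and the choice of the most-similar patch as the single extra negative are illustrative assumptions, not the exact formulation in the thesis (Section 3.3).

import torch
import torch.nn.functional as F

def patchwise_negative_learning_loss(query, positive, negatives, tau=0.07):
    # query:     (B, C)    feature of an output-image patch
    # positive:  (B, C)    feature of the corresponding input-image patch
    # negatives: (B, N, C) features of non-corresponding input-image patches
    query = F.normalize(query, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)

    # Similarity logits: one positive against N negatives, as in PatchNCE.
    l_pos = (query * positive).sum(dim=-1, keepdim=True) / tau           # (B, 1)
    l_neg = torch.bmm(negatives, query.unsqueeze(-1)).squeeze(-1) / tau  # (B, N)
    logits = torch.cat([l_pos, l_neg], dim=1)                            # (B, 1+N)

    # Standard InfoNCE term: pull the query toward its positive (class 0).
    labels = torch.zeros(query.size(0), dtype=torch.long, device=query.device)
    nce = F.cross_entropy(logits, labels)

    # Negative-learning term: pick one negative (here the most similar one,
    # an illustrative choice) and push its probability of being classified
    # as the positive toward zero via -log(1 - p), as in NLNL-style losses.
    probs = logits.softmax(dim=1)
    p_neg = probs[:, 1:].max(dim=1).values.clamp(max=1.0 - 1e-6)
    nl = -torch.log1p(-p_neg).mean()

    return nce + nl

Picking the most similar negative is only one plausible selection rule; the key property is that the extra term explicitly drives the chosen negative's posterior away from the positive class, rather than relying solely on the softmax denominator of the InfoNCE term.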

Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Tables
List of Illustrations
1 Introduction
2 Related Work
3 Methodology
3.1 Adversarial Loss
3.2 Patchwise Positive Learning Loss
3.3 Patchwise Negative Learning Loss
3.4 Superpixel-wise Contrastive Loss
3.5 Final Objective
4 Result and Discussion
4.1 Experiment Setup
4.2 Ablation Study
4.3 Comparison with State-of-the-Art Models
5 Conclusion
References

