
Graduate Student: Yu-Ting Cheng (鄭御廷)
Thesis Title: SceneGAN: Scene Image Style Transfer using Generative Adversarial Networks
Advisor: Chuan-Kai Yang (楊傳凱)
Committee Members: Yuan-Zheng Lai (賴源正), Nai-Wei Luo (羅乃維)
Degree: Master
Department: School of Management, Department of Information Management
Publication Year: 2018
Graduation Academic Year: 106 (ROC calendar)
Language: Chinese
Pages: 44
Chinese Keywords: partial image style transfer, generative adversarial networks
Foreign Keywords: image partial style transfer, GAN
Usage: 188 views, 9 downloads

Chinese Abstract: In recent years, many deep learning models have achieved great success in image style transfer and image super-resolution. However, two problems are frequently encountered with these methods: (1) GPU memory constraints restrict deep learning applications in image processing to images of at most 128 px; (2) during style transfer, most methods can only restyle an entire image, and cannot target a single object, the background, or the overall mood of a photo.

This thesis therefore proposes SceneGAN, an innovative and scalable method that performs partial image style transfer with a single model and enlarges the results with image super-resolution. This single model allows SceneGAN to produce higher-quality style transfers than other models, and it lifts the high GPU memory requirement for generating images of 256 px or above, so that hardware that can only generate 128 px images can still produce images at 256 px or higher resolution.


English Abstract: In recent years, many deep learning models have achieved great success in image style transfer and image super-resolution. However, two problems are often encountered with these methods: (1) the memory size of the GPU is limited, so deep learning applications can only deal with images with a resolution of up to 128 px; (2) when transferring image style, most methods can only transfer the style of an image as a whole, and cannot restyle a single object or the background.

Therefore, this thesis proposes SceneGAN, an innovative method that uses a single model to transfer image styles both locally and globally. This single model enables SceneGAN to produce higher-quality style transfers than other models, and solves the problem of generating images with resolutions of 256 px or more by modifying the popular StarGAN model.
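The record contains no code, but the pipeline the abstracts describe, a label-conditioned generator that restyles a 128 px image followed by a super-resolution stage that upscales the result to 256 px, can be sketched in a few lines. The PyTorch sketch below is illustrative only: the module shapes, the ConditionalGenerator and Upscaler names, and the number of SceneA domains are all assumptions, not the thesis's implementation.

# A minimal sketch (not the thesis code) of the two-stage pipeline the
# abstracts describe: restyle at 128 px, then upscale to 256 px.
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    """StarGAN-style generator: the target domain label c is tiled
    spatially and concatenated to the input image channels."""
    def __init__(self, num_domains: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + num_domains, 64, kernel_size=7, padding=3),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=7, padding=3),
            nn.Tanh(),  # output in [-1, 1], matching the input normalization
        )

    def forward(self, x: torch.Tensor, c: torch.Tensor) -> torch.Tensor:
        # Tile the one-hot domain label over the spatial dimensions.
        c_map = c.view(c.size(0), c.size(1), 1, 1).expand(-1, -1, x.size(2), x.size(3))
        return self.net(torch.cat([x, c_map], dim=1))

class Upscaler(nn.Module):
    """Stand-in for the super-resolution stage (e.g. an SRCNN-like model [2])
    that lifts the 128 px output to 256 px without retraining the generator."""
    def __init__(self):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False)
        self.refine = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.refine(self.up(x))

if __name__ == "__main__":
    num_domains = 5                      # hypothetical number of SceneA labels
    g, sr = ConditionalGenerator(num_domains), Upscaler()
    x = torch.randn(1, 3, 128, 128)      # 128 px input, the stated memory limit
    c = torch.zeros(1, num_domains)
    c[0, 2] = 1.0                        # target attribute as a one-hot label
    y = sr(g(x, c))                      # restyle at 128 px, then upscale
    print(y.shape)                       # torch.Size([1, 3, 256, 256])

Running the generator only at 128 px and delegating the final enlargement to a separate super-resolution model is what keeps the GPU memory footprint at the 128 px level while still yielding 256 px outputs.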

Table of Contents:
  Chinese Abstract
  English Abstract
  Acknowledgements
  Table of Contents
  List of Figures
  List of Tables
  Chapter 1  Introduction
    1.1  Research Motivation and Purpose
    1.2  Contributions
    1.3  Thesis Organization
  Chapter 2  Literature Review
    2.1  Generative Adversarial Networks
      2.1.1  Conditional GAN
      2.1.2  Deep Convolutional GAN
    2.2  Image-to-Image Translation
      2.2.1  pix2pix
      2.2.2  CycleGAN
      2.2.3  StarGAN
  Chapter 3  SceneGAN Algorithm Design and System Implementation
    3.1  System Flow
    3.2  The SceneA Dataset
    3.3  SceneA Attributes
      3.3.1  Features of Scene Photos
      3.3.2  SceneA Label Design
    3.4  GPU Memory Control
      3.4.1  Input Image Size
      3.4.2  Number of Neural Network Layers
      3.4.3  Batch Size
    3.5  Training Epochs
    3.6  SceneGAN Design
    3.7  Multi-Domain Image-to-Image Translation
      3.7.1  Adversarial Loss
      3.7.2  Domain Classification Loss
      3.7.3  Image Reconstruction Loss
      3.7.4  Full Objective
    3.8  StarGAN Design
      3.8.1  StarGAN Baseline Model Design
      3.8.2  StarGAN Network Architecture
      3.8.3  Mask Vector
      3.8.4  Training Strategy
      3.8.5  Improving GAN Training
    3.9  Variants of StarGAN
    3.10  Balancing the Adversarial Loss
  Chapter 4  Results and Evaluation
    4.1  System Environment
    4.2  System Parameter Settings
    4.3  Results and Evaluation
      4.3.1  StarGAN With and Without the Balanced Adversarial Loss
      4.3.2  SceneGAN Compared with StarGAN
      4.3.3  SceneGAN With and Without the Balanced Adversarial Loss
      4.3.4  Failure Cases of SceneGAN with the Balanced Adversarial Loss
  Chapter 5  Conclusion and Future Work
  References
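Sections 3.7.1 through 3.7.4 follow the multi-domain objective of StarGAN [1]. For reference, the losses as defined in that paper are reproduced below, where G is the generator, D_src and D_cls are the discriminator's source and domain-classification outputs, c is the target domain label, and c' is the original one; the balanced adversarial loss of Sections 3.10 and 4.3 is the thesis's own modification and is not detailed in this record.

\begin{align}
  \mathcal{L}_{adv} &= \mathbb{E}_{x}\!\left[\log D_{src}(x)\right]
                     + \mathbb{E}_{x,c}\!\left[\log\bigl(1 - D_{src}(G(x,c))\bigr)\right] \\
  \mathcal{L}_{cls}^{r} &= \mathbb{E}_{x,c'}\!\left[-\log D_{cls}(c' \mid x)\right],
  \qquad
  \mathcal{L}_{cls}^{f} = \mathbb{E}_{x,c}\!\left[-\log D_{cls}(c \mid G(x,c))\right] \\
  \mathcal{L}_{rec} &= \mathbb{E}_{x,c,c'}\!\left[\lVert x - G(G(x,c),\, c') \rVert_{1}\right] \\
  \mathcal{L}_{D} &= -\mathcal{L}_{adv} + \lambda_{cls}\,\mathcal{L}_{cls}^{r},
  \qquad
  \mathcal{L}_{G} = \mathcal{L}_{adv} + \lambda_{cls}\,\mathcal{L}_{cls}^{f} + \lambda_{rec}\,\mathcal{L}_{rec}
\end{align}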

[1] Yunjey Choi, Min-Je Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, and Jaegul Choo. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. CoRR, abs/1711.09020, 2017.
[2] Chao Dong, Chen Change Loy, Kaiming He, and Xiaoou Tang. Image super-resolution using deep convolutional networks. CoRR, abs/1501.00092, 2015.
[3] Chao Dong, Chen Change Loy, and Xiaoou Tang. Accelerating the super-resolution convolutional neural network. CoRR, abs/1608.00367, 2016.
[4] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27, pages 2672–2680. Curran Associates, Inc., 2014.
[5] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. CoRR, abs/1512.03385, 2015.
[6] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. Image-to-image translation with conditional adversarial networks. CoRR, abs/1611.07004, 2016.
[7] Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jung Kwon Lee, and Jiwon Kim. Learning to discover cross-domain relations with generative adversarial networks. CoRR, abs/1703.05192, 2017.
[8] Diederik P. Kingma and Max Welling. Auto-encoding variational bayes. CoRR, abs/1312.6114, 2013.
[9] Oliver Langner, Ron Dotsch, Gijsbert Bijlstra, Daniel H. J. Wigboldus, Skyler T. Hawk, and Ad van Knippenberg. Presentation and validation of the radboud faces database. Cognition and Emotion, 24(8):1377–1388, 2010.
[10] Mu Li, Wangmeng Zuo, and David Zhang. Deep identity-aware transfer of facial attributes. CoRR, abs/1610.05586, 2016.
[11] Ming-Yu Liu, Thomas Breuel, and Jan Kautz. Unsupervised image-to-image translation networks. CoRR, abs/1703.00848, 2017.
[12] Ming-Yu Liu and Oncel Tuzel. Coupled generative adversarial networks. CoRR, abs/1606.07536, 2016.
[13] Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In Proceedings of International Conference on Computer Vision (ICCV), 2015.
[14] Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets. CoRR, abs/1411.1784, 2014.
[15] Mahesh Chandra Mukkamala and Matthias Hein. Variants of RMSProp and Adagrad with logarithmic regret bounds. In Doina Precup and Yee Whye Teh, editors, Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 2545–2553, International Convention Centre, Sydney, Australia, 06–11 Aug 2017. PMLR.
[16] Guim Perarnau, Joost van de Weijer, Bogdan Raducanu, and Jose M. Álvarez. Invertible conditional GANs for image editing. CoRR, abs/1611.06355, 2016.
[17] Alec Radford, Luke Metz, and Soumith Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. CoRR, abs/1511.06434, 2015.
[18] Martin Riedmiller and Heinrich Braun. A direct adaptive method for faster backpropagation learning: The RPROP algorithm. In IEEE International Conference on Neural Networks, pages 586–591, 1993.
[19] O. Ronneberger, P. Fischer, and T. Brox. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI), volume 9351 of LNCS, pages 234–241. Springer, 2015. (available on arXiv:1505.04597 [cs.CV]).
[20] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott E. Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich. Going deeper with convolutions. CoRR, abs/1409.4842, 2014.
[21] Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. Unpaired image-to-image translation using cycle-consistent adversarial networks. CoRR, abs/1703.10593, 2017.
