Graduate student: 楊庭嘉 Ting-Jia Yang
Thesis title: 基於生成對抗網路之條件式高解析度圖像生成 (Conditional High-Resolution Image Synthesizing Based on Generative Adversarial Networks)
Advisor: 王乃堅 Nai-Jian Wang
Oral defense committee: 蘇順豐 Shun-Feng Su, 鍾順平 Shun-Ping Chung, 郭景明 Jing-Ming Guo, 方劭云 Shao-Yun Fang, 王乃堅 Nai-Jian Wang
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2018
Graduation academic year: 106
Language: English
Pages: 49
Keywords (Chinese): 圖像生成, 生成對抗網路, 條件式圖像生成
Keywords (English): image generation, generative adversarial network, conditional image generation
This thesis implements a system that generates high-resolution face images according to conditions supplied by the user. We first implement a high-resolution image generator based on the progressively growing generative adversarial network and the Wasserstein distance, and train it on the CelebA dataset to generate high-resolution face images. To control the output of this model so that it generates images matching the various facial attribute descriptions a user provides, we add a second generative adversarial network that learns the conditional latent distribution of the image generator. We train this secondary GAN to synthesize, from an input attribute vector, a latent vector that is fed to the image generator to manipulate its output images. We validate this method on the CelebA and MNIST datasets. Experimental results show that it can convert an existing generative model into a conditional generative model while retaining sufficient sample diversity. Moreover, it allows us to incorporate labels from multiple datasets without retraining the image generator.
We present a conditional high-resolution image generation system that can synthesize
face images conditioned on a wide range of input facial attributes.
First, we focus on training and optimizing an image generation model that synthesizes
high-resolution images using the Wasserstein distance and the progressive growing
method. Then, to exert control over the image generation model, we deviate from the
current trend and instead explore the viability of augmenting the image generation model
with a second generative adversarial network that learns its conditional latent-space
distribution, in the hope of manipulating the attributes of its output images.
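The Wasserstein training objective mentioned above can be made concrete with a small numerical sketch. The snippet below computes the WGAN-GP critic loss (critic score gap plus gradient penalty on interpolated samples) using a toy linear critic; the critic `w`, the sample shapes, and the penalty coefficient are illustrative assumptions, not the thesis's actual network:

```python
import numpy as np

# Toy WGAN-GP critic loss, assuming a linear critic f(x) = x . w
# purely for illustration (the real critic is a deep convolutional net).
rng = np.random.default_rng(0)

w = rng.normal(size=4)                      # hypothetical critic weights
def critic(x):
    return x @ w

real = rng.normal(loc=1.0, size=(8, 4))     # stand-in "real" samples
fake = rng.normal(loc=-1.0, size=(8, 4))    # stand-in "generated" samples

# Gradient penalty is evaluated on random interpolations of real and fake.
eps = rng.uniform(size=(8, 1))
x_hat = eps * real + (1.0 - eps) * fake

# For a linear critic, the input gradient is w everywhere, so the
# per-sample gradient norm is just ||w|| broadcast over the batch.
grad_norm = np.linalg.norm(np.broadcast_to(w, x_hat.shape), axis=1)

lam = 10.0  # penalty coefficient used in the WGAN-GP paper
critic_loss = (critic(fake).mean() - critic(real).mean()
               + lam * ((grad_norm - 1.0) ** 2).mean())
```

The critic is trained to minimize this loss (pushing real scores up and fake scores down) while the penalty keeps its gradient norm near 1, enforcing the Lipschitz constraint that the Wasserstein formulation requires.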
Experiments were conducted on the CelebA and MNIST datasets. The results show that our
method can convert an existing generative model into a conditional generative model that
synthesizes images according to input classes or attributes while retaining reasonable diversity.
Furthermore, this method allows one to incorporate labels from multiple datasets without
retraining the generative model from scratch.