
Graduate Student: 楊智 (Chih Yang)
Thesis Title: 基於生成對抗網路之多尺度高解析度紋理生成系統
(A Multi-Scale High Resolution Texture Generative System Based on Generative Adversarial Nets)
Advisors: 孫沛立 (Pei-Li Sun), 林宗翰 (Tzung-Han Lin)
Committee Members: 楊傳凱 (Chuan-Kai Yang), 陳怡永 (Yi-Yung Chen), 孫沛立 (Pei-Li Sun), 林宗翰 (Tzung-Han Lin)
Degree: Master
Department: Graduate Institute of Color and Illumination Technology, College of Applied Sciences
Publication Year: 2021
Academic Year of Graduation: 109 (2020-2021)
Language: Chinese
Pages: 91
Chinese Keywords: 紋理合成、深度學習、卷積神經網路、生成對抗網路
English Keywords: Texture Synthesis, Deep Learning, Convolutional Neural Networks, Generative Adversarial Nets

Texture is an important visual feature of an object's appearance. Texture images are in demand everywhere from the virtual worlds of computer graphics to real-world industries such as textiles, building materials, and printing. However, large texture images are difficult to obtain, so techniques that generate textures algorithmically have emerged. Traditional texture generation composes a new texture by sampling patches from an exemplar, but the results are prone to repetitive features and unnatural seams, and the resolution that can be handled is limited.
CNNs (Convolutional Neural Networks) are a deep learning architecture capable of extracting multi-scale features, and GANs (Generative Adversarial Nets), which are built on CNNs, can generate realistic images; both have been popular frameworks in recent texture synthesis research.
This study proposes a GAN-based texture generation system. Using the concept of an image pyramid, it decomposes the difficult task of high-resolution texture generation into a low-resolution generation task plus a series of multi-scale super-resolution (SR) tasks, and the network design allows the output image size to be adjusted freely.
This study achieves two goals:
1. High-resolution texture images that previously exceeded the hardware's computing capacity can now be handled with deep learning.
2. Texture images of arbitrary size can be generated.
Finally, this study evaluates the generated images with Fourier transform analysis, color histogram analysis, and a psychophysical experiment, and compares them against classic algorithms to examine the system's advantages and potential from multiple angles.


Texture is one of the most important visual features of objects. Texture images are needed in both virtual and real-world applications, including computer graphics and the textile, interior decoration, and printing industries. However, large texture images are not easy to obtain, so texture generation techniques came into being. The traditional approach grabs patches from a sample texture and composes them into a new texture image, but this process causes problems such as repetitive features and unnatural seams, and the resolution that can be processed is limited.
CNNs (Convolutional Neural Networks) are a deep learning architecture that can extract multi-scale features from images, and GANs (Generative Adversarial Nets), which are built on CNNs, can generate realistic images. Both have been used in many state-of-the-art studies on texture generation.
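To make the adversarial idea concrete, the following is a minimal sketch of one GAN update step in PyTorch. It illustrates the general technique only, not the thesis implementation; the tiny fully-connected networks, batch size, and learning rates are assumptions chosen for brevity.

    # Minimal GAN update step (illustrative sketch, not the thesis code).
    import torch
    import torch.nn as nn

    # Toy generator/discriminator; real texture GANs are convolutional.
    G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(),
                      nn.Linear(256, 28 * 28), nn.Tanh())
    D = nn.Sequential(nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2),
                      nn.Linear(256, 1))

    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()

    real = torch.rand(16, 28 * 28)  # batch of real patches (placeholder data)
    z = torch.randn(16, 64)         # latent noise

    # 1) Discriminator step: real patches -> label 1, generated -> label 0.
    fake = G(z).detach()
    loss_d = bce(D(real), torch.ones(16, 1)) + bce(D(fake), torch.zeros(16, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Generator step: try to fool the discriminator into labelling fakes as real.
    loss_g = bce(D(G(z)), torch.ones(16, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()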
This research proposes a texture generation system based on GAN. The system uses the concept of an image pyramid to decompose the difficult high-resolution generation task into a low-resolution generation task followed by a series of multi-scale super-resolution tasks. Because of the network architecture's design, the output image size can be adjusted freely.
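The decomposition can be sketched as follows, again assuming a PyTorch-style implementation; the stage count, channel widths, and upsampling mode are illustrative assumptions, not the actual network. Because every module is fully convolutional, enlarging the input noise map enlarges the output texture.

    # Sketch of a pyramid-style pipeline: coarse generation + stacked x2
    # super-resolution stages. Layer sizes here are assumptions.
    import torch
    import torch.nn as nn

    def conv_block(cin, cout):
        return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU())

    class BaseGenerator(nn.Module):
        """Fully convolutional: output size follows the input noise map."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(conv_block(8, 64), conv_block(64, 64),
                                     nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())
        def forward(self, z):
            return self.net(z)

    class SRStage(nn.Module):
        """One x2 super-resolution stage (upsample, then refine)."""
        def __init__(self):
            super().__init__()
            self.up = nn.Upsample(scale_factor=2, mode='nearest')
            self.refine = nn.Sequential(conv_block(3, 64),
                                        nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())
        def forward(self, x):
            return self.refine(self.up(x))

    base = BaseGenerator()
    stages = nn.ModuleList([SRStage() for _ in range(3)])

    z = torch.randn(1, 8, 32, 32)  # enlarge this map to enlarge the output
    x = base(z)                    # coarse 32x32 texture
    for s in stages:               # 32 -> 64 -> 128 -> 256
        x = s(x)
    print(x.shape)                 # torch.Size([1, 3, 256, 256])

Under these assumptions, feeding a 64x64 noise map through the same three stages would yield a 512x512 output, which is the sense in which the output size is freely adjustable.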
This research achieves two goals:
1. Deep learning can be used to generate high-resolution texture images without exceeding the computing power of the hardware.
2. Texture images of any size can be generated.
Finally, we used 2D Fourier spectrum analysis, color histogram analysis, and a visual assessment experiment to quantify and compare the performance of the proposed method against several classic algorithms, exploring the advantages and potential of the proposed system from multiple aspects.
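For reference, the two objective measures can be sketched in a few lines of NumPy; the histogram bin count and the correlation-based similarity score are illustrative choices, not necessarily those used in the thesis.

    # Sketch of the two objective checks: 2D Fourier magnitude spectrum and
    # per-channel color histograms (NumPy only; metric choices illustrative).
    import numpy as np

    def log_magnitude_spectrum(gray):
        """Centered log-magnitude spectrum of a grayscale image (2D array)."""
        f = np.fft.fftshift(np.fft.fft2(gray))
        return np.log1p(np.abs(f))

    def color_histograms(rgb, bins=64):
        """Normalized per-channel histograms of an 8-bit RGB image (H, W, 3)."""
        return [np.histogram(rgb[..., c], bins=bins, range=(0, 255),
                             density=True)[0] for c in range(3)]

    def histogram_similarity(h1, h2):
        """Mean Pearson correlation across channels (1.0 = identical shape)."""
        return float(np.mean([np.corrcoef(a, b)[0, 1] for a, b in zip(h1, h2)]))

    # Random stand-in images; replace with a sample texture and a generated
    # texture to compare their spectra and color statistics.
    sample = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
    result = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)

    spec_a = log_magnitude_spectrum(sample.mean(axis=2))
    spec_b = log_magnitude_spectrum(result.mean(axis=2))
    print(histogram_similarity(color_histograms(sample), color_histograms(result)))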

Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1  Introduction
  1.1 Research Background
  1.2 Motivation and Objectives
  1.3 Thesis Organization
  1.4 Terminology and Abbreviations
Chapter 2  Literature Review
  2.1 Example-Based Texture Synthesis
    2.1.1 Pixel-Based Synthesis
    2.1.2 Patch-Based Synthesis
    2.1.3 Optimization Based on Patch Differences
    2.1.4 Image-Pyramid-Based Texture Parameterization and Transformation
  2.2 Deep-Learning-Based Image Techniques
    2.2.1 Convolutional Neural Networks
    2.2.2 Neural Style Transfer
    2.2.3 Single Image Super-Resolution
  2.3 Deep-Learning-Based Generative Models
    2.3.1 Concepts and Definitions
    2.3.2 AutoEncoder & Variational AutoEncoder
    2.3.3 Generative Adversarial Nets
  2.4 Improvements to GAN
    2.4.1 Laplacian Pyramid GAN
    2.4.2 Deep Convolutional GAN
    2.4.3 Wasserstein GAN
    2.4.4 WGAN with Gradient Penalty
  2.5 Applications of GAN
    2.5.1 Spatial GAN & Periodic Spatial GAN
    2.5.2 DeBlur GAN
Chapter 3  Pilot Experiments
  3.1 Experiment 1: Texture Synthesis with a VAE
    3.1.1 Architecture
    3.1.2 Results
    3.1.3 Summary
  3.2 Experiment 2: GAN Network Design
    3.2.1 Effect of the Order of Batch Normalization and the Activation Function
    3.2.2 Effect of Style Loss
    3.2.3 Effect of the Upscaling Algorithm
    3.2.4 Summary
Chapter 4  Architecture
  4.1 Hardware and Environment
  4.2 SPGAN System Architecture
  4.3 Network Structure
Chapter 5  Main Experiment
  5.1 Network Training
    5.1.1 Training Data
    5.1.2 Training Data Preprocessing
    5.1.3 Training Procedure
    5.1.4 Loss Functions
    5.1.5 Optimization Method
    5.1.6 Training Parameter Settings
  5.2 Results
  5.3 Fourier Transform and Color Histogram Analysis
    5.3.1 Fourier Transform Analysis
    5.3.2 Color Histogram Analysis
    5.3.3 Experimental Results
    5.3.4 Summary
  5.4 Psychophysical Evaluation
    5.4.1 Experimental Setup
    5.4.2 Experimental Procedure
    5.4.3 Experimental Results
    5.4.4 Summary
  5.5 Generation Speed and Size Limits
  5.6 Summary
Chapter 6  Conclusions and Suggestions
  6.1 Conclusions
  6.2 Suggestions and Future Outlook
References

