
Graduate Student: 郭宗憲 (Zong-Xian Guo)
Thesis Title: Region-Oriented Image Style Transfer with Local Correlation Loss (結合局部相關度損失之區域導向圖片風格轉換研究)
Advisor: 林伯慎 (Bor-Shen Lin)
Committee Members: 楊傳凱 (Chuan-Kai Yang), 陳柏琳 (Ber-Lin Chen)
Degree: Master
Department: College of Management - Department of Information Management
Year of Publication: 2018
Graduation Academic Year: 106
Language: Chinese
Number of Pages: 50
Chinese Keywords: style transfer; convolutional neural network (風格轉換、卷積神經網路)
Foreign Keywords: Matting
Style transfer is a technique that maps the style of a source image onto a target image: it analyzes the source image, such as an artwork or painting, and renders its creative elements onto a specified target image, so that the target acquires a style similar to the source. L. A. Gatys proposed a convolutional-network style transfer model; however, that method transfers style between the entire source image and the entire target image, and cannot precisely control either the region of the source image from which the style is taken or the region of the target image where it is generated.
    This study investigates how to learn with multi-layer region masks within a convolutional-network style transfer model so as to achieve region-oriented style transfer. The method lets the user control both the region of the source image whose style is to be transferred and the region of the target image where that style is generated. In other words, a creator can restyle only a designated object or region in the target image without affecting anything outside it, enabling precise control over the stylized result. Furthermore, we introduce a local correlation loss into the style transfer model to improve the quality of the generated image. This requires first gathering local correlation statistics of the target image and storing them in a Laplacian matrix; during style learning, minimizing the local correlation loss between the generated image and the target image drives the local correlation characteristics of the generated image toward those of the target, improving its quality. With the local correlation loss added, every style transfer test we ran produced clearer, more natural, and higher-quality stylized images.
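To make the region mechanism concrete, here is a minimal PyTorch sketch of a region-masked style loss. It assumes the standard Gram-matrix style loss of Gatys et al. [10], with user-drawn binary masks downsampled to each VGG layer's resolution; the function and variable names (masked_gram, region_style_loss, gen_feats, and so on) and the normalization are illustrative assumptions, not the thesis's exact formulation.

```python
import torch
import torch.nn.functional as F

def masked_gram(feat, mask):
    """Gram matrix of a feature map restricted to a spatial region.

    feat: (C, H, W) feature map from one VGG layer.
    mask: (1, h, w) binary region mask, resized to the layer's resolution.
    """
    C, H, W = feat.shape
    m = F.interpolate(mask[None], size=(H, W), mode="nearest")[0]  # (1, H, W)
    f = (feat * m).reshape(C, -1)      # zero out features outside the region
    n = m.sum().clamp(min=1.0)         # number of pixels inside the region
    return (f @ f.t()) / (C * n)       # region-normalized Gram matrix

def region_style_loss(gen_feats, style_feats, gen_mask, style_mask):
    """Sum of squared Gram differences over the selected VGG layers."""
    loss = 0.0
    for fg, fs in zip(gen_feats, style_feats):
        g_gen = masked_gram(fg, gen_mask)      # style written into this region
        g_sty = masked_gram(fs, style_mask)    # style read from this region
        loss = loss + ((g_gen - g_sty) ** 2).sum()
    return loss
```

Zeroing features outside the mask and normalizing by the in-region pixel count keeps the Gram statistics comparable across regions of different sizes, which is one plausible way to restrict both where style is read from and where it is written to.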


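As the abstract describes it, the local correlation term is a quadratic form in a Laplacian matrix built from the target image, in the spirit of Levin et al.'s closed-form matting [27] and Laplacian-steered transfer [14]. Below is a minimal PyTorch sketch under that assumption; construction of the sparse matting Laplacian itself is omitted, and the names local_correlation_loss and matting_laplacian are illustrative.

```python
import torch

def local_correlation_loss(gen_img, matting_laplacian):
    """Sum of x_c^T M x_c over colour channels c.

    gen_img: (H, W, 3) generated image, values in [0, 1].
    matting_laplacian: (H*W, H*W) sparse Laplacian M, precomputed once
        from the target image's local colour statistics.
    """
    H, W, C = gen_img.shape
    loss = gen_img.new_zeros(())
    for c in range(C):
        x = gen_img[:, :, c].reshape(-1, 1)   # vectorized channel, (H*W, 1)
        # x^T M x is small when neighbouring pixels co-vary the way they
        # do in the target image, i.e. local correlation is preserved.
        loss = loss + (x.t() @ torch.sparse.mm(matting_laplacian, x)).squeeze()
    return loss
```

In the full objective this term would be weighted and added to the content and style losses, so that gradient descent trades off stylization against preserving the target image's local structure.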

Table of Contents:
Chapter 1 Introduction
  1.1 Motivation
  1.2 Objectives and Overview of Results
  1.3 Thesis Organization
Chapter 2 Literature and Technical Background
  2.1 Convolutional Neural Networks
    2.1.1 The VGG-Net Model
    2.1.2 Feature Maps
  2.2 Traditional Style Transfer Methods
  2.3 The CNN Style Transfer Model
    2.3.1 Content Feature Loss Computation
    2.3.2 Style Feature Loss Computation
    2.3.3 CNN Style Transfer Model Architecture
    2.3.4 Improvements to the CNN Style Transfer Model
  2.4 Extensions and Applications of the CNN Style Transfer Model
    2.4.1 Semantic Style Transfer
    2.4.2 Object Style Transfer
    2.4.3 Portrait Style Transfer
    2.4.4 Color Style Transfer
    2.4.5 Audio Style Transfer
  2.5 Chapter Summary
Chapter 3 Region-Oriented Style Transfer Model
  3.1 Method of the Region-Oriented Style Transfer Model
  3.2 Data Preprocessing
  3.3 Feature Loss Computation
    3.3.1 Content Feature Loss Computation
    3.3.2 Style Feature Loss Computation
    3.3.3 Experimental Results Based on Content and Style Features
  3.4 Alpha Matting Loss Computation
  3.5 Experimental Results
Chapter 4 Conclusion and Future Directions
References

References:
[1] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[2] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[3] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, and L. Fei-Fei, "ImageNet large scale visual recognition challenge," arXiv preprint arXiv:1409.0575, Sept. 2014.
[4] A. Hertzmann, "Painterly rendering with curved brush strokes of multiple sizes," in Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques. ACM, 1998, pp. 453–460.
[5] A. Hertzmann, C. E. Jacobs, N. Oliver, B. Curless, and D. H. Salesin, "Image analogies," in Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques. ACM, 2001, pp. 327–340.
[6] A. A. Efros and T. K. Leung, "Texture synthesis by non-parametric sampling," in Proceedings of the IEEE International Conference on Computer Vision, vol. 2. IEEE, 1999, pp. 1033–1038.
[7] L.-Y. Wei and M. Levoy, "Fast texture synthesis using tree-structured vector quantization," in Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. ACM Press/Addison-Wesley, 2000, pp. 479–488.
[8] L. A. Gatys, A. S. Ecker, and M. Bethge, "A neural algorithm of artistic style," arXiv e-prints, Aug. 2015.
[9] L. A. Gatys, A. S. Ecker, and M. Bethge, "Texture synthesis using convolutional neural networks," in Advances in Neural Information Processing Systems, 2015, pp. 262–270.
[10] L. A. Gatys, A. S. Ecker, and M. Bethge, "Image style transfer using convolutional neural networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2414–2423.
[11] G. Berger and R. Memisevic, "Incorporating long-range consistency in CNN-based texture generation," in International Conference on Learning Representations, 2017.
[12] V. M. Patel, R. Gopalan, R. Li, and R. Chellappa, "Visual domain adaptation: A survey of recent advances," IEEE Signal Processing Magazine, vol. 32, no. 3, pp. 53–69, 2015.
[13] Y. Li, N. Wang, J. Liu, and X. Hou, "Demystifying neural style transfer," in Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, 2017, pp. 2230–2236.
[14] S. Li, X. Xu, L. Nie, and T.-S. Chua, "Laplacian-steered neural style transfer," in Proceedings of the 2017 ACM Multimedia Conference. ACM, 2017, pp. 1716–1724.
[15] L. A. Gatys, A. S. Ecker, M. Bethge, A. Hertzmann, and E. Shechtman, "Controlling perceptual factors in neural style transfer," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 3985–3993.
[16] Y. Jing, Y. Liu, Y. Yang, Z. Feng, Y. Yu, D. Tao, and M. Song, "Stroke controllable fast style transfer with adaptive receptive fields," arXiv preprint arXiv:1802.07101, 2018.
[17] E. Risser, P. Wilmot, and C. Barnes, "Stable and controllable neural texture synthesis and style transfer using histogram losses," arXiv e-prints, Jan. 2017.
[18] A. J. Champandard, "Semantic style transfer and turning two-bit doodles into fine artworks," arXiv e-prints, Mar. 2016.
[19] C. Li and M. Wand, "Combining Markov random fields and convolutional neural networks for image synthesis," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 2479–2486.
[20] Y.-L. Chen and C.-T. Hsu, "Towards deep style transfer: A content-aware perspective," in Proceedings of the British Machine Vision Conference, 2016.
[21] R. Mechrez, I. Talmi, and L. Zelnik-Manor, "The contextual loss for image transformation with non-aligned data," arXiv preprint arXiv:1803.02077, 2018.
[22] C. Castillo, S. De, X. Han, B. Singh, A. K. Yadav, and T. Goldstein, "Son of Zorn's lemma: Targeted style transfer using instance-aware semantic segmentation," in IEEE International Conference on Acoustics, Speech and Signal Processing. IEEE, 2017, pp. 1348–1352.
[23] A. Selim, M. Elgharib, and L. Doyle, "Painting style transfer for head portraits using convolutional neural networks," ACM Transactions on Graphics, vol. 35, no. 4, p. 129, 2016.
[24] F. Luan, S. Paris, E. Shechtman, and K. Bala, "Deep photo style transfer," in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017, pp. 6997–7005.
[25] P. Verma and J. O. Smith, "Neural style transfer for audio spectrograms," in Proceedings of the NIPS Workshop on Machine Learning for Creativity and Design, 2017.
[26] P. K. Mital, "Time domain neural audio style transfer," in Proceedings of the NIPS Workshop on Machine Learning for Creativity and Design, 2018.
[27] A. Levin, D. Lischinski, and Y. Weiss, "A closed-form solution to natural image matting," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 2, pp. 228–242, 2008.
