
Graduate Student: 陳玟伶 (Wen-Ling Chen)
Thesis Title: 基於多尺度生成對抗網路與邊緣線索之圖像修補技術
Image Completion via Multi-Scale Generative Adversarial Networks and Edge Cues
Advisor: 花凱龍 (Kai-Lung Hua)
Committee Members: 簡士哲 (Shih-Che Chien), 楊朝龍 (Chao-Lung Yang), 陸敬互 (Ching-Hu Lu), 陳永耀 (Yung-Yao Chen)
Degree: Master (碩士)
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Publication Year: 2019
Graduation Academic Year: 107
Language: English
Number of Pages: 44
Keywords (Chinese): 圖像修復、邊緣線索、生成對抗網路
Keywords (English): image completion, edge cues, generative adversarial networks

Recent deep-learning-based methods have made substantial progress in image completion. However, existing methods often leave the boundary of the masked region blurry after completion, and they frequently generate distorted structures. This is mainly because convolutional neural networks are ineffective at copying information from the regions surrounding the mask, and for images with strong semantics (such as faces) a completion strategy that only reuses the known surrounding content cannot succeed. Contour cues in an image are therefore very important: they let us identify the boundaries between objects more explicitly. In this thesis, we propose a two-stage image completion framework consisting of an edge completion network and a multi-scale image completion network. The edge completion network generates the edge contours of the missing region; we train it with a hinge loss that judges whether the generated result is real or fake, which also makes the completed edges more realistic. The image completion network then generates the image progressively from low resolution to high resolution, taking the edges completed in the previous stage as a condition for the multi-scale image completion network during training. This makes the boundaries of the missing parts more reasonable while enlarging the network's receptive field. Experimental results show that, across different mask sizes, our method generates better results than existing state-of-the-art methods.
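The hinge adversarial loss mentioned above has a compact form. The snippet below is a minimal PyTorch-style sketch, assuming a discriminator that outputs unbounded real-valued scores (per image or per patch); it is an illustration, not code from the thesis.

```python
import torch
import torch.nn.functional as F

def discriminator_hinge_loss(d_real, d_fake):
    # Hinge loss for the discriminator: push scores on real edge maps above +1
    # and scores on generated (completed) edge maps below -1.
    return torch.mean(F.relu(1.0 - d_real)) + torch.mean(F.relu(1.0 + d_fake))

def generator_hinge_loss(d_fake):
    # Hinge loss for the generator: raise the discriminator's score
    # on the generated edge maps.
    return -torch.mean(d_fake)
```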


Recent deep-learning-based approaches have shown significant improvements in image completion. However, existing methods often create distorted structures or blurry textures that are inconsistent with the surrounding areas. This is mainly due to the ineffectiveness of convolutional neural networks at copying information from distant spatial locations. Contour cues in the image are therefore quite important, as they allow the boundaries between objects to be identified more definitely. In this thesis, we propose a two-stage architecture for image completion, consisting of an edge completion network and a coarse-to-fine image completion network. The edge completion network generates edges in the missing regions; we train it with a hinge loss that judges whether its input is real or fake, which also makes the completed edges more realistic. The image completion network then generates the image from low resolution to high resolution, with the completed edges fed as a condition into both the coarse network and the refinement network; this makes the boundaries of the missing parts more reasonable while providing a larger receptive field. Experimental results show that our method generates better-quality images than state-of-the-art approaches, both quantitatively and qualitatively.
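To make the two-stage data flow concrete, the sketch below shows one plausible wiring of the pipeline described above, in PyTorch style. The names `TwoStageInpainter`, `edge_net`, `coarse_net`, and `refine_net` are placeholders for unspecified encoder-decoder generators, and the mask is assumed to be 1 inside the missing region; this is an assumption-laden illustration, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn

class TwoStageInpainter(nn.Module):
    # Hypothetical wrapper: the three sub-networks are placeholder
    # encoder-decoder generators, not the architectures from the thesis.
    def __init__(self, edge_net, coarse_net, refine_net):
        super().__init__()
        self.edge_net = edge_net
        self.coarse_net = coarse_net
        self.refine_net = refine_net

    def forward(self, masked_img, gray_img, masked_edges, mask):
        # Stage 1: complete the edge map inside the hole from the known edges
        # and the grayscale image.
        edges = self.edge_net(torch.cat([gray_img, masked_edges, mask], dim=1))

        # Stage 2 (coarse): predict a low-frequency fill, conditioned on the
        # completed edge map.
        coarse = self.coarse_net(torch.cat([masked_img, edges, mask], dim=1))

        # Stage 2 (refine): sharpen the coarse result, again conditioned on the
        # same completed edges.
        refined = self.refine_net(torch.cat([coarse, edges, mask], dim=1))

        # Keep the known pixels and paste generated content into the hole only.
        completed = masked_img * (1.0 - mask) + refined * mask
        return edges, completed
```

Feeding the same completed edge map into both the coarse and refinement stages is what lets the edge cues constrain object boundaries at every scale of the coarse-to-fine generator.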

Table of Contents:
Abstract in Chinese
Abstract in English
Contents
List of Figures
List of Tables
1 Introduction
2 Related Works
3 Proposed Approach
3.1 Testing and Training Framework
3.2 Edge Completion Network
3.3 Coarse-to-fine Image Completion Network
4 Experiments
4.1 Implementation Details
4.1.1 Pre-processing for training data
4.1.2 Mask generation
4.2 Datasets
4.3 Ablation Studies
4.4 Comparison with State-of-the-art Methods
4.4.1 Quantitative Evaluation
4.4.2 Qualitative Evaluation
5 Conclusion
References


Full-text release date: 2024/08/21 (campus network)
Full-text release date: 2024/08/21 (off-campus network)
Full-text release date: 2024/08/21 (National Central Library: Taiwan NDLTD system)