| Author | 陳玟伶 Wen-Ling Chen |
|---|---|
| Thesis Title | 基於多尺度生成對抗網路與邊緣線索之圖像修補技術 (Image Completion via Multi-Scale Generative Adversarial Networks and Edge Cues) |
| Advisor | 花凱龍 Kai-Lung Hua |
| Committee Members | 簡士哲 Shih-Che Chien, 楊朝龍 Chao-Lung Yang, 陸敬互 Ching-Hu Lu, 陳永耀 Yung-Yao Chen |
| Degree | Master |
| Department | Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science |
| Publication Year | 2019 |
| Academic Year | 107 |
| Language | English |
| Pages | 44 |
| Keywords | image completion, edge cues, generative adversarial networks |
Recent deep learning-based approaches have shown significant improvements in image completion. However, existing methods often create distorted structures or blurry textures that are inconsistent with the surrounding areas, mainly because convolutional neural networks are ineffective at copying information from distant spatial locations. Contour cues in an image are therefore important: they let us delineate the boundaries between objects more precisely. In this thesis, we propose a two-stage architecture for image completion, consisting of an edge completion network and a coarse-to-fine image completion network. The edge completion network generates edges in the missing regions; it is trained with a hinge loss that determines whether its input is real or fake, which also makes the completed edges more realistic. The image completion network then generates the image progressively from low to high resolution, with the completed edges from the first stage fed as a condition into both the coarse network and the refinement network. This makes the boundaries of the missing parts more plausible while enlarging the receptive field. Experimental results show that, across different mask sizes, our method generates better-quality images than state-of-the-art approaches, both quantitatively and qualitatively.
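The hinge loss mentioned for training the edge completion network's discriminator is the standard GAN hinge objective. A minimal NumPy sketch is given below; the function names and the array-of-scores interface are assumptions for illustration, not the thesis's actual implementation.

```python
import numpy as np

def discriminator_hinge_loss(d_real, d_fake):
    """Hinge loss for the discriminator: push scores on real edge maps
    above +1 and scores on generated edge maps below -1."""
    return (np.mean(np.maximum(0.0, 1.0 - d_real))
            + np.mean(np.maximum(0.0, 1.0 + d_fake)))

def generator_hinge_loss(d_fake):
    """Generator objective: raise the discriminator's score on
    generated (completed) edge maps."""
    return -np.mean(d_fake)
```

Once the discriminator separates real and fake scores by the unit margin, its loss is zero, so only misclassified or marginal samples contribute gradients.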
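The second-stage flow, with the stage-one edge map conditioning both the coarse and the refinement sub-networks, can be sketched as follows. This is a NumPy sketch under assumed names: `coarse_net` and `refine_net` stand in for the actual convolutional generators, and images are H×W×C arrays with a binary hole mask.

```python
import numpy as np

def complete_image(masked_img, mask, completed_edges, coarse_net, refine_net):
    """Stage-two sketch: concatenate the masked image, the binary mask,
    and the stage-one edge map channel-wise, run the coarse generator,
    refine with the same edge condition, then composite so that known
    pixels are kept and only the hole is filled."""
    coarse_in = np.concatenate([masked_img, mask, completed_edges], axis=-1)
    coarse_out = coarse_net(coarse_in)                       # low-detail guess
    refine_in = np.concatenate([coarse_out, mask, completed_edges], axis=-1)
    refined = refine_net(refine_in)                          # high-detail result
    return masked_img * (1.0 - mask) + refined * mask        # fill the hole only
```

Feeding the completed edges to both sub-networks keeps object boundaries consistent across the two scales, which is how the edge cues constrain the generated structure.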