
Graduate Student: Cheng-En Yen (閻承恩)
Thesis Title: Image Demosaicking Based on Conditional GANs and Multi-objective Loss
Advisor: Kai-Lung Hua (花凱龍)
Committee Members: Jing-Ming Guo (郭景明), Kuo-Liang Chung (鍾國亮), Kai-Lung Hua (花凱龍), Ching-Hu Lu (陸敬互)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of Publication: 2018
Graduation Academic Year: 106 (ROC calendar)
Language: English
Number of Pages: 48
Keywords: deep learning, image demosaicking, conditional generative adversarial networks, multi-objective loss
Access Count: 209 views, 2 downloads
  • Image demosaicking is an important topic in image processing: it aims to reconstruct the incomplete color image (the mosaic image) produced by an image sensor overlaid with a color filter array into a full-color image. In this thesis, we propose a new framework for image demosaicking based on a multi-stage refinement approach and conditional generative adversarial networks. We use a multi-objective loss function, consisting of a pixel loss, a perceptual loss, and a conditional adversarial loss, to compute the gradients for learning. We also substantially improve the quality of the demosaicked images by introducing a conditional discriminator. When a generated image is fed to the original discriminator, the discriminator scores it according to how realistic it looks, with more realistic images receiving higher scores. In our training, however, this causes mode collapse: the generator learns to map all input images onto a small set of realistic-looking images, so the input and output images can differ drastically. Our conditional discriminator not only evaluates how realistic the output image is but also considers the similarity between the input image and the demosaicked image, which reduces the likelihood of mode collapse. Our method successfully reconstructs high-quality full-color images, with quality surpassing other state-of-the-art deep-learning demosaicking methods.
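
    The conditional discriminator described above can be made concrete with a short sketch. The following is a hypothetical Keras implementation, not the thesis's actual network: the 128x128 patch size, the layer counts, and the filter sizes are illustrative assumptions. What it demonstrates is the key idea from the abstract: the discriminator scores the (input, demosaicked) pair jointly, with a linear output suitable for a least-squares adversarial loss.

```python
# Hedged sketch of a conditional discriminator for demosaicking.
# Assumptions (not from the thesis): 128x128 RGB patches, three
# strided conv blocks with filter counts 64/128/256.
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_conditional_discriminator(h=128, w=128):
    mosaic = layers.Input(shape=(h, w, 3), name="pre_demosaicked_input")
    candidate = layers.Input(shape=(h, w, 3), name="demosaicked_candidate")
    # Conditioning: the discriminator sees the input and the candidate
    # jointly, so it can penalize outputs that look realistic but do
    # not match the input (the mode-collapse failure described above).
    x = layers.Concatenate(axis=-1)([mosaic, candidate])
    for filters in (64, 128, 256):
        x = layers.Conv2D(filters, 4, strides=2, padding="same")(x)
        x = layers.LeakyReLU(alpha=0.2)(x)
    # Linear (un-squashed) score map, as used by least-squares GANs.
    score = layers.Conv2D(1, 4, padding="same")(x)
    return Model([mosaic, candidate], score, name="conditional_discriminator")
```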


    One crucial step in the digital camera pipeline is image demosaicking, which aims to reproduce a full-color image from the incomplete color image acquired by a single-chip image sensor overlaid with a color filter array. We introduce a new two-stage framework for image demosaicking based on generative adversarial networks and a multi-objective loss. We use a multi-objective loss, comprising a least-squares conditional adversarial loss, a pixel loss, and a perceptual loss, to compute the stochastic gradient for our model. We use a conditional discriminator and a conditional generator, which significantly enhance demosaicking performance. The original discriminator gives a high score as long as the input image looks real. This brings about mode collapse in our model, since the generator then maps all input images onto a few sets of realistic images. Our conditional discriminator prevents mode collapse because it not only predicts real or fake but also considers the similarity between the input image and the demosaicked image. Our conditional generator produces a demosaicked image conditioned on the input image, which restricts the low-frequency variance between the input image and the demosaicked image. Experiments demonstrate that our approach reproduces higher-quality color images than other state-of-the-art deep-learning-based demosaicking methods.
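
    As an illustration of the multi-objective loss described in the abstract, the sketch below combines the three terms for the generator update. It is an assumption-laden example, not the thesis's implementation: the choice of VGG16 feature layer for the perceptual term, the L1 form of the pixel term, and the weighting coefficients lambda_pix, lambda_per, and lambda_adv are all hypothetical.

```python
# Hedged sketch of the multi-objective generator loss: pixel loss +
# perceptual loss + least-squares conditional adversarial loss.
# The feature layer and the lambda weights are illustrative guesses.
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.applications import VGG16

_vgg = VGG16(include_top=False, weights="imagenet")
_features = Model(_vgg.input, _vgg.get_layer("block3_conv3").output)

def generator_loss(real, fake, d_score_fake,
                   lambda_pix=1.0, lambda_per=0.1, lambda_adv=0.01):
    # Pixel term: mean absolute error in image space.
    pixel = tf.reduce_mean(tf.abs(real - fake))
    # Perceptual term: mean squared error between VGG feature maps
    # (images are assumed to be scaled to VGG's expected input range).
    perceptual = tf.reduce_mean(tf.square(_features(real) - _features(fake)))
    # Least-squares adversarial term: push the conditional
    # discriminator's score for the fake pair toward the real label 1.
    adversarial = tf.reduce_mean(tf.square(d_score_fake - 1.0))
    return lambda_pix * pixel + lambda_per * perceptual + lambda_adv * adversarial
```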

    Abstract in Chinese
    Abstract in English
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    1 Introduction
    2 Related Works
    3 Proposed Methodology
    3.1 Pre-demosaicking
    3.2 Adversarial Model for Refinement
    3.3 Loss Function
    3.4 Network Architecture
    3.5 Pixel Replacement
    4 Experimental Results
    4.1 Implementation Details
    4.2 Ablation Studies
    4.2.1 Varying the Loss Function
    4.2.2 Varying the Network Architecture
    4.3 Comparison with Other Methods
    5 Conclusions
    References

