
Graduate Student: Della Fitrayani Budiono
Thesis Title: Image Restoration using Deep Networks with Attention Module (基於深度網路使用注意力機制之影像還原技術)
Advisor: Jing-Ming Guo (郭景明)
Committee Members: Jing-Ming Guo (郭景明), Jui-Sheng Chou (周瑞生), Jian-Jiun Ding (丁建均), Gee-Sern Hsu (徐繼聖)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Academic Year of Graduation: 108 (2019-2020)
Language: English
Number of Pages: 113
Keywords: image restoration, image reconstruction, image denoising, deep learning, attention module
Access Count: 262 views / 0 downloads

Image restoration is a popular research topic that aims to predict the original, clean image from a corrupted or noisy one. Its applications involve the techniques of image reconstruction and image denoising. Promisingly, recent advances in deep learning can be applied to facilitate progress in both fields. In this thesis, two problems are addressed: image reconstruction from ODBTC (Ordered Dithering Block Truncation Coding, an image compression technique) and blind image denoising.
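To make the compression side of the first problem concrete, the sketch below implements a BTC-style codec in which an ordered dither array serves as the per-pixel threshold, in the spirit of ODBTC. It is a minimal illustration only: the block size, the 0-255 dither scaling, and the min/max quantizers are illustrative assumptions, not the exact configuration studied in the thesis.

    import numpy as np

    def odbtc_encode(image, dither, block=8):
        # BTC-style encoding sketch: each block keeps only its min and max
        # values plus a bitmap from thresholding against a dither array.
        h, w = image.shape
        bitmap = np.zeros((h, w), dtype=bool)
        extremes = []
        for y in range(0, h, block):
            for x in range(0, w, block):
                blk = image[y:y + block, x:x + block].astype(np.float64)
                lo, hi = blk.min(), blk.max()
                # Scale the dither array into the block's dynamic range and
                # use it as a spatially varying threshold.
                t = lo + dither[:blk.shape[0], :blk.shape[1]] / 255.0 * (hi - lo)
                bitmap[y:y + blk.shape[0], x:x + blk.shape[1]] = blk >= t
                extremes.append((lo, hi))
        return bitmap, extremes

    def odbtc_decode(bitmap, extremes, block=8):
        # Naive decoding: map each bit back to its block's min or max value.
        h, w = bitmap.shape
        out = np.zeros((h, w))
        i = 0
        for y in range(0, h, block):
            for x in range(0, w, block):
                lo, hi = extremes[i]; i += 1
                blk = bitmap[y:y + block, x:x + block]
                out[y:y + blk.shape[0], x:x + blk.shape[1]] = np.where(blk, hi, lo)
        return out

Reconstruction from such an output is difficult because every block is reduced to two gray levels and one bitmap, which motivates a learned, generative approach to recovering the original image.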
For image reconstruction from ODBTC, a Self-Attention Generative Adversarial Network (SAGAN) is proposed to reconstruct the ODBTC-encoded image to an appearance as close to the original as possible. The proposed method is the first deep learning based approach to ODBTC image reconstruction. Leveraging the outstanding generative ability of GANs, the proposed network successfully recovers the information of the original image. In addition, the self-attention mechanism allows the network to retain more spatial details.
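As an illustration of the self-attention mechanism referred to above, the following is a minimal PyTorch sketch of the non-local attention block popularized by SAGAN; the channel-reduction factor of 8 and the zero-initialized residual scale gamma follow the commonly used formulation and are assumptions here, not necessarily the exact design in the thesis.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SelfAttention2d(nn.Module):
        # Non-local self-attention block in the SAGAN style: every spatial
        # position attends to every other position of the feature map.
        def __init__(self, channels, reduction=8):
            super().__init__()
            self.query = nn.Conv2d(channels, channels // reduction, 1)
            self.key = nn.Conv2d(channels, channels // reduction, 1)
            self.value = nn.Conv2d(channels, channels, 1)
            # Zero-initialized residual scale: training starts from the
            # plain convolutional behavior and learns to add attention.
            self.gamma = nn.Parameter(torch.zeros(1))

        def forward(self, x):
            b, c, h, w = x.shape
            q = self.query(x).flatten(2).transpose(1, 2)  # (b, hw, c/r)
            k = self.key(x).flatten(2)                    # (b, c/r, hw)
            attn = F.softmax(torch.bmm(q, k), dim=-1)     # (b, hw, hw)
            v = self.value(x).flatten(2)                  # (b, c, hw)
            out = torch.bmm(v, attn.transpose(1, 2))      # (b, c, hw)
            return x + self.gamma * out.view(b, c, h, w)

Because every spatial position can attend to every other position, the block propagates information across the whole image, which is how the network retains spatial details that plain convolutions with limited receptive fields would lose.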
Moreover, a VGG19 perceptual loss is further employed to enhance the perceptual quality of the reconstruction.
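A perceptual loss compares feature activations of a fixed, pretrained network instead of raw pixels. A minimal sketch using the torchvision VGG19 model is given below; the cut-off layer index is an assumption for illustration, and inputs would normally be normalized with ImageNet statistics first.

    import torch.nn as nn
    from torchvision import models

    class VGG19PerceptualLoss(nn.Module):
        # L2 distance between VGG19 feature maps of the restored and
        # reference images, computed with frozen pretrained weights.
        def __init__(self, cut=35):  # assumed cut point (relu5_4)
            super().__init__()
            vgg = models.vgg19(weights=models.VGG19_Weights.DEFAULT).features
            self.features = nn.Sequential(*list(vgg.children())[:cut + 1])
            for p in self.features.parameters():
                p.requires_grad = False  # keep the feature extractor fixed

        def forward(self, restored, reference):
            # Inputs are assumed to be ImageNet-normalized RGB tensors.
            return nn.functional.mse_loss(self.features(restored),
                                          self.features(reference))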
Extensive experiments show that the proposed method outperforms previous ODBTC image reconstruction methods in both the PSNR metric and visual quality.
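For reference, PSNR is the standard peak signal-to-noise ratio; for an M x N image with peak intensity MAX (255 for 8-bit images) it is defined as

    \mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \bigl( I(i,j) - \hat{I}(i,j) \bigr)^{2},
    \qquad
    \mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}^{2}}{\mathrm{MSE}}

Higher PSNR indicates a reconstruction closer to the original in the mean-squared sense.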
For blind image denoising, the Dense in Dense Network with Attention module (DiDNT) is designed to restore a corrupted image without any additional reference. The proposed network stacks deeper layers to capture sufficient features, and dense connectivity is applied to prevent the network from degrading as the structure deepens. Furthermore, an attention module is employed to highlight the importance of individual features. The experimental results show that the designed network achieves better performance than previously proposed blind image denoisers and works for different types of noisy images.
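As an illustration of the two ingredients named above, here is a minimal PyTorch sketch of a densely connected block followed by a squeeze-and-excitation-style channel attention gate; the growth rate, layer count, and reduction factor are illustrative assumptions, and this is not the exact DiDNT architecture.

    import torch
    import torch.nn as nn

    class DenseBlock(nn.Module):
        # Dense connectivity sketch: each layer receives the concatenation
        # of all preceding feature maps, which keeps gradients flowing even
        # in very deep stacks.
        def __init__(self, channels, growth=32, layers=4):
            super().__init__()
            self.convs = nn.ModuleList(
                nn.Conv2d(channels + i * growth, growth, 3, padding=1)
                for i in range(layers))
            self.fuse = nn.Conv2d(channels + layers * growth, channels, 1)

        def forward(self, x):
            feats = [x]
            for conv in self.convs:
                feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
            return self.fuse(torch.cat(feats, dim=1))

    class ChannelAttention(nn.Module):
        # Squeeze-and-excitation style gate: global average pooling yields
        # one weight per channel, highlighting the more informative features.
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(channels, channels // reduction), nn.ReLU(),
                nn.Linear(channels // reduction, channels), nn.Sigmoid())

        def forward(self, x):
            w = self.mlp(x.mean(dim=(2, 3)))           # (batch, channels)
            return x * w.unsqueeze(-1).unsqueeze(-1)   # reweight channels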
In conclusion, the proposed deep learning approaches to both image reconstruction and image denoising achieve state-of-the-art performance. The experimental results suggest that the modules designed in the proposed deep networks can effectively solve real-world problems in the field of image restoration.


Recommendation Letter
Approval Letter
Abstract
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
  1.1 Low-level vision
  1.2 ODBTC Image Reconstruction
  1.3 Image denoising
  1.4 Contributions
2 Related Work
  2.1 Convolutional Neural Networks (CNN)
    2.1.1 Basic Components of CNN
    2.1.2 Residual Network (ResNet)
    2.1.3 Dense Network (DenseNet)
    2.1.4 U-Net
    2.1.5 Generative Adversarial Network (GAN)
    2.1.6 VGG19 Perceptual Loss
    2.1.7 Attention Mechanism
  2.2 Image Compression using Ordered Dithering Block Truncation Coding (ODBTC)
  2.3 Image denoising
    2.3.1 Conventional non CNN-based denoisers
    2.3.2 CNN-based denoisers
  2.4 Blind Image Denoising
    2.4.1 Noise2Noise (N2N)
    2.4.2 Noise2Void (N2V)
    2.4.3 Self-supervised Bayesian Denoising with Blind-spot Networks
  2.5 Standard Image Quality Assessment
3 ODBTC Reconstruction Proposed Methods
4 Blind Image Denoising Proposed Methods
5 ODBTC Reconstruction Experimental Result
  5.1 ODBTC Image Reconstruction on Grayscale Images
  5.2 ODBTC Image Reconstruction on Color Images
6 Blind Image Denoising Experimental Result
  6.1 Denoising on Additive White Gaussian Noise (AWGN)
  6.2 Denoising on Poisson Noise
  6.3 Denoising on Impulse Noise
  6.4 Discussions
7 Conclusions
  7.1 Conclusion for ODBTC Image Reconstruction
  7.2 Future Works for ODBTC Image Reconstruction
  7.3 Conclusion for Blind Image Denoising
  7.4 Future Works for Blind Image Denoising
References


Full Text Available: 2025/08/18 (campus network)
Full Text Available: 2025/08/18 (off-campus network)
Full Text Available: 2025/08/18 (National Central Library: Taiwan Thesis and Dissertation System)