
Graduate Student: Kun-Cheng Yang (楊坤澄)
Thesis Title: Image Super-Resolution Reconstruction Based on Fusion of Deep Residual Network and Covariation Pooling Layer
Advisor: Chen-Hsiung Yang (楊振雄)
Committee Members: Sendren Sheng-Dong Xu (徐勝均), Chin-Sheng Chen (陳金聖), Chang-Hsi Wu (吳常熙), Chen-Hsiung Yang (楊振雄)
Degree: Master
Department: College of Engineering, Graduate Institute of Automation and Control
Year of Publication: 2022
Graduation Academic Year: 110
Language: Chinese
Number of Pages: 95
Keywords: Color Model, Channel Attention, Deep Learning, Super Resolution
    In recent years, with the growing popularity of deep learning, research in computer vision has flourished, especially in image recognition and image processing. Improvements in camera hardware and the precision of optical equipment have also made images clearer than before. However, images are sometimes blurred by external or human factors, and simply zooming in cannot recover the lost detail. Image super-resolution is therefore needed to solve this problem.
    In this thesis, we use deep residual learning so that the pixels filled in when an image is enlarged are predicted more accurately; the magnification factor can reach up to eight times. We also adopt the channel-domain and spatial-domain attention mechanisms to ensure that both the spatial positions in the image and the color relationships between channels are refined more precisely.
    In addition to extracting feature maps with ordinary two-dimensional convolutions, we also use three-dimensional convolutions in the process, so that the correspondences in the spatial information are more strongly linked. Finally, compared with traditional methods, the proposed approach produces sharper output for the enlarged image, and its visual effect is closer to what the human eye perceives as a clear image, rather than merely a smooth, blurred one.
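The channel-domain attention mentioned above follows the squeeze-and-excitation pattern: pool each channel to a scalar, pass the result through a small bottleneck, and rescale the channels with sigmoid weights. Below is a minimal NumPy sketch of that idea; the shapes, random weights, and reduction ratio are illustrative assumptions, not the thesis's actual architecture.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style channel attention.
    feat: (c, h, w) feature map; w1: (c//r, c) and w2: (c, c//r) gate weights."""
    squeeze = feat.mean(axis=(1, 2))                 # global average pool -> (c,)
    hidden = np.maximum(w1 @ squeeze, 0.0)           # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # sigmoid channel weights in (0, 1)
    return feat * gate[:, None, None]                # rescale each channel

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8)) * 0.1               # reduction ratio 4: 8 -> 2
w2 = rng.standard_normal((8, 2)) * 0.1
out = channel_attention(feat, w1, w2)
print(out.shape)                                     # (8, 4, 4)
```

Because the gate is a sigmoid, each channel is only ever attenuated or passed through, never amplified, which is what lets the network emphasize informative color channels relative to the rest.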

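The record does not state which upsampling layer the thesis uses, but deep residual SR networks of this family (e.g. EDSR, RCAN) typically enlarge with sub-pixel (pixel-shuffle) layers, reaching 8x by stacking three 2x steps. A NumPy sketch of the rearrangement, under that assumption:

```python
import numpy as np

def pixel_shuffle(feat, r):
    """Rearrange a (c*r*r, h, w) feature map into (c, h*r, w*r):
    the sub-pixel upsampling step used by most deep SR networks."""
    crr, h, w = feat.shape
    c = crr // (r * r)
    return (feat.reshape(c, r, r, h, w)
                .transpose(0, 3, 1, 4, 2)            # -> (c, h, r, w, r)
                .reshape(c, h * r, w * r))

# One 2x step on a single 1x1 "pixel" with 4 channels; 8x = three such steps.
feat = np.arange(4.0).reshape(4, 1, 1)
up = pixel_shuffle(feat, 2)
print(up[0])                                         # [[0. 1.] [2. 3.]]
```

Each group of r*r channels at a spatial location is laid out as an r-by-r patch in the enlarged image, so the network "fills in" the new pixels from learned features rather than by interpolation.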
    Table of Contents
    Abstract (Chinese) / Abstract (English) / Acknowledgements / Contents / List of Figures / List of Tables
    Chapter 1  Introduction
      1.1 Preface
      1.2 Literature Review
      1.3 Research Motivation
      1.4 Outline
    Chapter 2  Super-Resolution Image Quality Enhancement Algorithms
      2.1 Convolutional Neural Networks
      2.2 Attention Mechanisms
        2.2.1 Channel-Domain Attention
        2.2.2 Spatial-Domain Attention
        2.2.3 Mixed-Domain Attention
      2.3 Residual Channel Attention
      2.4 Evaluation Metrics
    Chapter 3  Deep Learning Model Implementation
      3.1 Deep Learning Frameworks
      3.2 Deep Learning Model Components
        3.2.1 Kernel Padding
        3.2.2 Activation Functions
        3.2.3 Optimizers
        3.2.4 Loss Functions
      3.3 Residual Structure for Channel-Spatial Attention Image Super-Resolution
      3.4 Introduction of the Second-Order Pooling Mechanism
      3.5 Channel-Spatial-Domain Attention Module
    Chapter 4  Experimental Results and Analysis
      4.1 Experimental Setup
      4.2 Training and Test Datasets
      4.3 Quality Analysis of Processed Images
      4.4 Comparison of Experimental Results
    Chapter 5  Conclusion and Future Work
      5.1 Conclusion
      5.2 Future Work
    References
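Section 2.4 above covers evaluation metrics; PSNR and SSIM are the standard measures for super-resolution quality. A minimal PSNR sketch for 8-bit images, assuming the thesis uses the standard definition:

```python
import numpy as np

def psnr(ref, rec, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between a reference image and a reconstruction."""
    mse = np.mean((ref.astype(np.float64) - rec.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")                          # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.full((8, 8), 100, dtype=np.uint8)
rec = np.full((8, 8), 116, dtype=np.uint8)           # uniform error of 16 gray levels
val = psnr(ref, rec)
print(round(val, 2))                                 # ~24.05 dB
```

Higher is better; note that PSNR rewards smooth, low-MSE outputs, which is exactly why the thesis also appeals to perceptual sharpness rather than MSE-based scores alone.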


    Full-text release date: 2025/09/19 (campus network)
    Full-text release date: 2025/09/19 (off-campus network)
    Full-text release date: 2025/09/19 (National Central Library: Taiwan NDLTD system)