
Author: Yi-Yu Chen (陳奕宇)
Thesis Title: A Study of Residual in Residual Dense Networks with New Local Implicit Image Function for Arbitrary-Scale Image Super Resolution (殘差的殘差密集網路結合新局部隱性圖片表示函式用於圖片任意倍率超解析度之研究)
Advisor: Yi-Leh Wu (吳怡樂)
Committee: 唐政元, 陳建中, 閻立剛
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Thesis Publication Year: 2023
Graduation Academic Year: 111
Language: English
Pages: 36
Keywords (in Chinese): 卷積神經網路、隱式神經表示、位置編碼、任意倍率超解析度
Keywords (in other languages): Convolutional Neural Networks, Implicit Neural Representation, Positional Encoding, Arbitrary-scale Super Resolution
In recent years, with the development of deep convolutional neural networks, image super-resolution has made remarkable progress. However, most researchers focus on training a model for a single scale factor; only a few work on a single model that generalizes to arbitrary scales.

We build on prior work on arbitrary-resolution super-resolution with a single model and improve it. The referenced model is an auto-encoder structure: the encoder extracts the latent features of an image, and the decoder reconstructs those features into an image at any resolution the user specifies. In this thesis, we adopt an encoder that extracts richer features and a decoder that reconstructs images in finer detail. Experiments show that the proposed model, RRDN-NLIIF (short for Residual in Residual Dense Network with New Local Implicit Image Function), outperforms the original model on the PSNR metric.


With the development of deep convolutional neural networks, image super-resolution has made remarkable progress. However, most researchers focus on training one model for a single scale factor; only a few focus on building a model that can adapt to any scale.

We build on prior work on single-model arbitrary-resolution super-resolution and improve it. The referenced model is an auto-encoder structure: the encoder extracts feature maps from the input image, and the decoder reconstructs those feature maps into an image at any resolution the user specifies. Our experimental results show that our model, RRDN-NLIIF (short for Residual in Residual Dense Network with New Local Implicit Image Function), outperforms the referenced model on the PSNR metric.
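The encoder/decoder split described above can be illustrated with a minimal sketch of the local-implicit-function idea: a decoder is queried at continuous output coordinates, each query conditioned on the nearest latent feature and the relative offset to it, so any output resolution can be rendered from one fixed feature map. This toy NumPy version (random MLP weights, nearest-cell lookup instead of the thesis's actual trained networks, names like `ToyImplicitDecoder` are hypothetical) only demonstrates the querying mechanism, not the proposed RRDN-NLIIF model itself.

```python
import numpy as np

def make_coord_grid(h, w):
    """Pixel-center coordinates of an h-by-w grid, normalized to [-1, 1]."""
    ys = (np.arange(h) + 0.5) / h * 2 - 1
    xs = (np.arange(w) + 0.5) / w * 2 - 1
    return np.stack(np.meshgrid(ys, xs, indexing="ij"), axis=-1)  # (h, w, 2)

class ToyImplicitDecoder:
    """Stand-in for an implicit image function: a small random MLP that
    maps (latent feature, relative coordinate) -> RGB value."""
    def __init__(self, feat_dim, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (feat_dim + 2, hidden))
        self.w2 = rng.normal(0.0, 0.1, (hidden, 3))

    def __call__(self, feat, rel_coord):
        x = np.concatenate([feat, rel_coord], axis=-1)
        return np.maximum(x @ self.w1, 0.0) @ self.w2  # ReLU MLP

def decode_arbitrary_scale(feat_map, out_h, out_w, decoder):
    """Render any output resolution by querying the decoder at every
    output pixel's continuous coordinate."""
    fh, fw, _ = feat_map.shape
    q = make_coord_grid(out_h, out_w)                  # query coordinates
    # Nearest latent cell for each query point.
    iy = np.clip(((q[..., 0] + 1) / 2 * fh).astype(int), 0, fh - 1)
    ix = np.clip(((q[..., 1] + 1) / 2 * fw).astype(int), 0, fw - 1)
    feat = feat_map[iy, ix]                            # (out_h, out_w, C)
    cell = make_coord_grid(fh, fw)[iy, ix]             # centers of those cells
    rel = q - cell                                     # relative offset
    return decoder(feat, rel)                          # (out_h, out_w, 3)

# One 8x8 feature map (as an encoder would produce) can be decoded to any
# size, including non-integer scales such as 8 -> 23 or 8 -> 31.
rng = np.random.default_rng(1)
feat_map = rng.normal(size=(8, 8, 16))
decoder = ToyImplicitDecoder(feat_dim=16)
img = decode_arbitrary_scale(feat_map, 23, 31, decoder)
```

Because the decoder is a function of continuous coordinates rather than a fixed upsampling layer, the same weights serve every scale factor, which is what distinguishes this family of models from single-scale super-resolution networks.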

Contents
論文摘要 (Chinese abstract)
Abstract
Contents
LIST OF FIGURES
LIST OF TABLES
Chapter 1. Introduction
  1.1 Research background
  1.2 Research motivation
Chapter 2. Related Work
  2.1 Single scale super resolution
  2.2 Arbitrary-scale super resolution
  2.3 Positional encoding in super resolution
Chapter 3. Proposed Method
  3.1 Baseline RDN-LIIF
    3.1.1 Details of encoder
    3.1.2 Details of RDB
    3.1.3 Details of decoder
  3.2 Improved model RRDN-NLIIF
    3.2.1 Details of encoder
    3.2.2 Details of RRDB and DB
    3.2.3 Details of decoder
Chapter 4. Experiments
  4.1 Learning and comparison
    Set up
    Result
  4.2 Ablation study
    Result
  4.3 Learning with size-varied ground-truths
    Set up
    Result
Chapter 5. Conclusions and Future Work
References

