
Author: 謝宗倫 (Tsung-Lun Hsieh)
Thesis Title: 針對卷積神經網路之光場影像超解析度基於抽樣一致性校正的品質增強 (Downsampling Consistency Correction-based Quality Enhancement for CNN-based Light Field Super Resolution)
Advisor: 鍾國亮 (Kuo-Liang Chung)
Committee: 廖弘源, 范國清, 鮑興國, 賴祐吉
Degree: Master (碩士)
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science (電資學院 - 資訊工程系)
Thesis Publication Year: 2023
Graduation Academic Year: 111
Language: English
Pages: 34
Keywords (in Chinese): 卷積神經網路、抽樣一致性校正、抽樣不一致問題、光場影像、品質增強、超解析度
Keywords (in other languages): Convolutional neural networks, downsampling consistency correction, downsampling inconsistency problem, light field images, quality enhancement, super resolution
Reference times: Clicks: 236, Downloads: 5


    In recent years, numerous CNN-based light field (LF) image super-resolution (SR) methods have been developed. However, because of the downsampling inconsistency between the low-resolution (LR) testing LF images and the LR training LF images, these methods may suffer from quality degradation. To remedy this problem, this thesis proposes a downsampling consistency correction-based (DCC-based) quality enhancement method. First, a quality-based voting strategy is proposed to recognize the downsampling scheme used in the training step. Next, a cascaded Swin Transformer-based recognizer is proposed to identify the downsampled position and the downsampling scheme used in the LR testing LF image, and the proposed DCC-based method is then applied to significantly improve the quality of the upsampled LF image. Comprehensive experiments on typical LF image datasets demonstrate the significant quality improvement of our method relative to state-of-the-art LF SR methods.
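    The quality-based voting strategy in the abstract can be illustrated with a minimal sketch. This is an illustrative guess, not the thesis's actual procedure: it assumes each candidate downsampling scheme produces one reconstruction per held-out HR image, each image votes for the scheme whose reconstruction scores the highest PSNR, and the majority winner is taken as the scheme used in training. All function and scheme names here are hypothetical.

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio (dB) between a reference and a test image."""
    mse = np.mean((ref.astype(np.float64) - img.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def vote_downsampling_scheme(hr_images, recon_by_scheme):
    """Each HR image casts one vote for the candidate scheme whose
    reconstruction of it has the highest PSNR; the scheme with the
    most votes is declared the one used in the training step."""
    votes = {scheme: 0 for scheme in recon_by_scheme}
    for i, hr in enumerate(hr_images):
        best = max(recon_by_scheme,
                   key=lambda s: psnr(hr, recon_by_scheme[s][i]))
        votes[best] += 1
    return max(votes, key=votes.get), votes

# Toy usage: reconstructions under "4:2:0(A)" are exact, so it wins every vote.
rng = np.random.default_rng(0)
hrs = [rng.integers(0, 256, (8, 8)).astype(np.float64) for _ in range(4)]
recons = {"4:2:0(A)": [h.copy() for h in hrs],
          "4:2:0(Direct)": [h + rng.normal(0, 5.0, h.shape) for h in hrs]}
winner, votes = vote_downsampling_scheme(hrs, recons)
```

    The design point being illustrated is that the scheme identification is made per-image and then aggregated, so a few atypical images cannot flip the decision.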

    Recommendation Letter
    Approval Letter
    Abstract in Chinese
    Abstract in English
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    1 Introduction
      1.1 Related CNN-based LF SR works
      1.2 Motivation
      1.3 Contributions
    2 Downsampling Inconsistency-sensitive Quality Degradation Problem
      2.1 Quantitative DIS Quality Degradation Analysis
      2.2 Better Quality Performance Using 4:2:0(A) in the Training Step
        2.2.1 Worse quality performance using 4:2:0(Direct) in the training step
        2.2.2 Better quality performance using 4:2:0(A) in the training step
    3 Proposed Downsampling Consistency Correction-based Quality Enhancement Method
      3.1 Identifying the Downsampling Scheme Used in the Training Step
      3.2 The Proposed Algorithm
        3.2.1 The first step for computing the probability of the downsampled position being at the center position or at a non-center position for the LR testing image I_{LR,testing}
        3.2.2 The second step for upsampling I_{LR,testing} using CNN_{BICU-D} or CNN_{4:2:0(A)}
        3.2.3 The third step for upsampling I_{LR,testing}^{4:2:0(Direct)} using the proposed DCC-based approach
    4 Experimental Results
      4.1 Objective Quality Enhancement Merit of Our Method
      4.2 Subjective Quality Enhancement Merits of Our Method
    5 Discussions and Conclusions
    References
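    The three steps of the proposed algorithm (Sections 3.2.1-3.2.3) suggest a dispatch structure like the sketch below. This is only a guess at the control flow from the chapter headings; every callable (`recognize_center_prob`, `recognize_scheme`, the upsamplers, `dcc_correct`) and the 0.5 threshold are placeholders for the trained networks and correction module described in the thesis.

```python
def dcc_pipeline(lr_img, recognize_center_prob, recognize_scheme,
                 upsample_bicu_d, upsample_420a, dcc_correct,
                 threshold=0.5):
    """Hypothetical three-step flow inferred from the chapter outline.
    Step 1: estimate the probability that the downsampled position of the
            LR testing image is at the center position.
    Step 2: if center-sampled, upsample with the CNN matching the scheme
            identified by the cascaded recognizer.
    Step 3: otherwise treat the input as 4:2:0(Direct)-downsampled, apply
            the DCC-based correction first, and then upsample."""
    p_center = recognize_center_prob(lr_img)          # step 1
    if p_center >= threshold:
        scheme = recognize_scheme(lr_img)             # cascaded recognizer
        if scheme == "BICU-D":
            return upsample_bicu_d(lr_img)            # step 2, variant 1
        return upsample_420a(lr_img)                  # step 2, variant 2
    return upsample_420a(dcc_correct(lr_img))         # step 3

# Stub usage: strings stand in for images so the branching is visible.
out_center = dcc_pipeline("x", lambda im: 0.9, lambda im: "BICU-D",
                          lambda im: "bicu:" + im, lambda im: "420a:" + im,
                          lambda im: "fix(" + im + ")")
out_offcenter = dcc_pipeline("x", lambda im: 0.1, lambda im: "BICU-D",
                             lambda im: "bicu:" + im, lambda im: "420a:" + im,
                             lambda im: "fix(" + im + ")")
```

    The point of the structure is that the expensive correction in step 3 is applied only when the recognizer flags a downsampling inconsistency, leaving consistent inputs on the fast path.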


    Full text public date: 2026/06/16 (Internet)
    Full text public date: 2026/06/16 (National Library)