
Author: Chih-Chi Wang (王稚淇)
Title: Real-world Underwater Image Enhancement Based on a Two-stage GAN-based Architecture (基於二階段生成對抗網路架構之真實水下影像改善方法)
Advisor: Chang-Hong Lin (林昌鴻)
Committee members: Tsung-Nan Lin (林宗男), Pei-Yuan Wu (吳沛遠), Jenq-Shiou Leu (呂政修)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2020
Academic year of graduation: 108 (2019-2020)
Language: English
Number of pages: 76
Chinese keywords: underwater image enhancement, underwater image restoration, image improvement, deep learning, generative adversarial network
English keywords: Underwater Image Enhancement, Underwater Image Restoration, Image Enhancement, Generative Adversarial Network (GAN), Convolutional Neural Network (CNN), deep learning
  • Abstract (translated from Chinese): Underwater image restoration and enhancement is a challenging problem in image processing. Light entering water is attenuated by absorption and scattering, so underwater images exhibit varying degrees of blur, haze, and bluish or greenish color casts, which degrade visual quality for both underwater robots and human viewers. Demand is growing for underwater tasks such as object detection, image segmentation, object tracking, and terrain analysis; to raise the accuracy of these tasks, both academia and industry seek methods that improve underwater image quality. Many enhancement and restoration algorithms already exist, most relying on prior knowledge or direct image recovery. Although they achieve some improvement, a single method usually works only in certain scenes, or its parameters are hard to estimate, which limits practical use. The method proposed in this thesis uses adversarial training on real underwater images. In our generative network, the generator has two parts: the first combines multi-scale processing and prior knowledge into a learning network, and the second uses a U-Net-like network to strengthen the enhanced image produced by the first stage. We compare the final results against similar GAN-based and CNN-based methods. In visual quality, our results show better visual effects, more natural colors, and a longer visible distance than the other methods. In quantitative scores, our method is clearly higher in both peak signal-to-noise ratio (PSNR) and structural similarity (SSIM).
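The generator described above composes two networks, with the second refining the first stage's output. The composition can be sketched as follows; this is a hypothetical illustration with toy stand-in functions, not the thesis's actual networks (those are detailed in its Appendix A):

```python
# Sketch of a two-stage generator: stage one produces a coarse enhanced
# image, stage two refines it. Simple per-pixel callables stand in for
# the real multi-scale and U-Net-like networks to keep the sketch runnable.

def first_stage(img):
    """Stand-in for the multi-scale, prior-knowledge enhancement network."""
    return [min(255.0, p * 1.1) for p in img]  # toy brightness/contrast gain

def second_stage(img):
    """Stand-in for the U-Net-like refinement network."""
    return [round(p, 1) for p in img]  # toy refinement step

def generator(img):
    """Two-stage generator: refine the first stage's enhanced output."""
    return second_stage(first_stage(img))

print(generator([100.0, 200.0, 250.0]))  # → [110.0, 220.0, 255.0]
```

The key design point is that the stages are trained as one generator inside the adversarial loop, so the refinement stage learns to correct whatever artifacts the first stage leaves behind.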


    Underwater images suffer from the scattering and absorption of light during propagation, so their visual quality can be poor for both robots and humans. Because underwater images look blurry, hazy, bluish, or greenish depending on the strength of absorption and scattering, it is essential to restore and enhance them before applying further vision tasks. Several enhancement approaches based on prior knowledge and image recovery have been studied, but most cannot handle diverse scenes. To address this problem, we propose a new GAN-based architecture that consists of a two-stage generator. The first-stage enhancement network is designed to restore the image with a multi-scale architecture and prior-knowledge modules, and the second stage uses a U-Net architecture to refine the first stage's output. We train the proposed architecture on a real-world underwater dataset, the Underwater Image Enhancement Benchmark (UIEB), and compare its performance both qualitatively and quantitatively with other CNN-based and GAN-based methods. Qualitatively, our results are more visually pleasing, with more natural colors and a longer viewing distance. Quantitatively, our method achieves markedly higher peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) scores than the other methods.
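PSNR, one of the quantitative metrics named in the abstract, can be computed from the mean squared error between an enhanced image and its reference. The following minimal pure-Python sketch (not the thesis code; SSIM is omitted because it requires windowed statistics) shows the computation on flattened pixel sequences:

```python
import math

def psnr(reference, enhanced, max_val=255.0):
    """Peak signal-to-noise ratio for two equal-length pixel sequences."""
    mse = sum((r - e) ** 2 for r, e in zip(reference, enhanced)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

ref = [128.0] * 16  # flat 4x4 reference image, flattened
out = [130.0] * 16  # "enhanced" result with a uniform error of 2 gray levels
print(round(psnr(ref, out), 2))  # → 42.11
```

Higher PSNR means the enhanced image is numerically closer to the reference; a uniform 2-level error over 8-bit pixels already yields roughly 42 dB.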

    Abstract (Chinese) I
    Abstract (English) II
    Acknowledgements III
    List of Contents IV
    List of Figures VI
    List of Tables VIII
    Chapter 1 Introduction 1
      1.1 Motivation 1
      1.2 Contributions 6
      1.3 Thesis Organization 7
    Chapter 2 Related Works 8
      2.1 Prior-driven Models 8
      2.2 Data-driven Models 10
    Chapter 3 Proposed Methods 12
      3.1 Data Augmentation and Preprocessing 13
        3.1.1 Random Color Hue Shifting 14
        3.1.2 Multi-scale Random Crop 17
        3.1.3 Random Flip 18
        3.1.4 Normalization 19
      3.2 Training Flow 20
        3.2.1 First Stage Enhancement Network 22
        3.2.2 Second Stage Enhancement Network 26
        3.2.3 Multi-scale Discriminator 30
        3.2.4 Losses 32
    Chapter 4 Experimental Results 40
      4.1 Experimental Environment 40
      4.2 Underwater Datasets 41
        4.2.1 Underwater Image Enhancement Benchmark Dataset (UIEB) 41
      4.3 Training Details and Strategy 45
      4.4 Evaluation and Results 46
        4.4.1 Qualitative Comparison 46
        4.4.2 Quantitative Comparison 49
    Chapter 5 Conclusions and Future Works 53
      5.1 Conclusions 53
      5.2 Future Works 54
    References 55
    Appendix A 60
      A.1 Details of Proposed Architecture 60
        A.1.1 First Stage Enhancement Network 60
        A.1.2 Second Stage Enhancement Network 62
        A.1.3 Multi-scale Discriminator 64
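Section 3.1 of the table of contents lists the data-augmentation steps: random color hue shifting, multi-scale random crop, random flip, and normalization. These can be sketched as follows; this is a hypothetical pure-Python illustration on a tiny nested-list image, assuming a common [-1, 1] normalization, and hue shifting is omitted because it requires a color-space conversion (the thesis does not publish its implementation):

```python
import random

def random_flip(img, rng):
    """Horizontally flip the image with probability 0.5."""
    return [row[::-1] for row in img] if rng.random() < 0.5 else img

def random_crop(img, size, rng):
    """Crop a random size x size patch; a multi-scale crop would also
    pick `size` at random before cropping."""
    h, w = len(img), len(img[0])
    top = rng.randrange(h - size + 1)
    left = rng.randrange(w - size + 1)
    return [row[left:left + size] for row in img[top:top + size]]

def normalize(img, max_val=255.0):
    """Map pixel values from [0, max_val] into [-1, 1]."""
    return [[2.0 * p / max_val - 1.0 for p in row] for row in img]

rng = random.Random(0)
img = [[float(r * 4 + c) for c in range(4)] for r in range(4)]  # 4x4 toy image
patch = random_crop(random_flip(img, rng), size=2, rng=rng)
print(normalize(patch))
```

Chaining geometric augmentations before normalization is the usual ordering, since cropping and flipping operate on raw pixel grids while normalization prepares the tensor for the network.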


    Full-text release date: 2025/07/24 (campus network)
    Full text not authorized for public release (off-campus network)
    Full text not authorized for public release (National Central Library: Taiwan NDLTD system)