Graduate student: Yo-Wei Tseng (曾友瑋)
Thesis title: Color-correction and Augmentation for Underwater Object Recognition (結合色彩校正和擴增技術用於水下物體辨識)
Advisor: Min-Fan Lee (李敏凡)
Committee members: Ming-Chung Tsai (蔡明忠), Hsiao-Ming Shiah (夏筱明), Tao-Hsien Hsu (徐道賢)
Degree: Master
Department: College of Engineering - Graduate Institute of Automation and Control
Year of publication: 2018
Academic year of graduation: 106 (ROC calendar)
Language: English
Number of pages: 96
Keywords (Chinese): 深度學習, 影像擴增, 色彩校正, 水下物體辨識
Keywords (English): Deep learning, Image Augmentation, Color correction, Underwater object recognition
    Underwater image recognition is not yet widely used because of poor underwater visibility and color cast. Much of the literature applies different image processing methods to improve recognition, but does not discuss improving the training data first. This study raises the recognition rate of the model by improving the usability of the training data.
    The study combines a model built from two currently popular algorithms, Single Shot Detector (SSD) and MobileNets, with color correction (white balance and dehazing) and augmentation techniques to form three cases. Each case is trained as a deep learning model, and the cases are verified and compared in underwater areas with slightly insufficient light.
    With only one target category to recognize, the experimental results show that the model trained on image-processed data without augmentation performs best of all the experiments. In addition, applying white balance to the input image when the model is used also improves the recognition rate.
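    The abstract reports that white-balancing the input image improves the recognition rate, but this record does not say which white-balance algorithm the thesis uses. As a rough illustration only, the sketch below applies the common gray-world assumption, which scales each color channel so that the three channel means become equal. The function name `gray_world_white_balance` and the sample pixel values are hypothetical and are not taken from the thesis.

```python
def gray_world_white_balance(pixels):
    """Scale each RGB channel so that all three channel means become equal.

    `pixels` is a list of (r, g, b) tuples with values in 0..255.
    This is a gray-world sketch, assumed here for illustration; it is not
    necessarily the white-balance method used in the thesis.
    """
    if not pixels:
        return []
    n = len(pixels)
    # Per-channel mean of the input image.
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    # Target gray level: the average of the three channel means.
    gray = sum(means) / 3
    # Gain per channel; a channel that is entirely zero is left unchanged.
    gains = [gray / m if m > 0 else 1.0 for m in means]
    # Apply the gains and clip to the valid 8-bit range.
    return [
        tuple(min(255, max(0, round(p[c] * gains[c]))) for c in range(3))
        for p in pixels
    ]


# Hypothetical bluish-green pixels standing in for an underwater frame.
sample = [(40, 120, 160), (60, 140, 180), (20, 100, 140)]
balanced = gray_world_white_balance(sample)
```

    In a detection pipeline, a step like this would run on each frame before it is passed to the detector. Gray-world is a simple stand-in; the stronger red attenuation underwater may call for a more careful correction.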

    Acknowledgements
    Abstract (Chinese)
    Abstract
    Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    Chapter 2 Methods
        2.1 Data Collection
            2.1.1 Image processing
            2.1.2 Data types
            2.1.3 Labeling
        2.2 Convolutional Neural Networks (CNN)
            2.2.1 Neural networks
            2.2.2 Multi-layer networks
            2.2.3 Convolutional layer
            2.2.4 Activation function types
            2.2.5 Pooling and stride
            2.2.6 Fully-connected layer
            2.2.7 Data augmentation
        2.3 SSD-MobileNet Detector
            2.3.1 Single Shot Detector (SSD)
            2.3.2 MobileNet
            2.3.3 Tensorflow
    Chapter 3 Results
        3.1 Introduction
            3.1.1 Device
            3.1.2 Experimental settings
            3.1.3 Image snapshot
            3.1.4 Limitation
        3.2 Data Preprocessing
            3.2.1 Comparison of color correction and dehazing
            3.2.2 Image validation
            3.2.3 Datasets
        3.3 Data Training
            3.3.1 Environment
            3.3.2 Label convert
            3.3.3 Cases training
        3.4 Tests
            3.4.1 Test Aim
            3.4.2 Variables
            3.4.3 Cases study
        3.5 Summary
    Chapter 4 Conclusions
    References

