
Student: Yan-Kuan Li (李彥寬)
Thesis Title: Deep Learning with Depth-wise Separable Convolution and Feature Constraint for Exposure Image Restoration and Its FPGA Implementation
(以深度分離卷積及特徵約束的深度學習方法應用於曝光影像還原之研究及其FPGA實現)
Advisor: Cheng-Hsiung Yang (楊振雄)
Committee Members: Sheng-Dong Xu (徐勝均), Chang-Si Wu (吳常熙), Chih-Ta Yen (顏志達)
Degree: Master
Department: Graduate Institute of Automation and Control, College of Engineering
Year of Publication: 2023
Academic Year of Graduation: 111
Language: Chinese
Number of Pages: 104
Chinese Keywords: underexposure, overexposure, deep learning, FPGA
Foreign Keywords: Underexposure, Overexposure, Deep Learning, FPGA
    Capturing photographs with improper exposure is still a major source of camera imaging defects. Exposure problems can be divided into underexposure, where the camera's exposure time is too short and the image is too dark to recognize easily, and overexposure, where the exposure time is too long and parts of the image are washed out or overly bright. These extreme lighting conditions typically cause many difficulties for both machine and human vision, and both greatly reduce the contrast and visual appeal of an image.
    Much previous work has concentrated on the underexposed case, in which images are usually captured in low-light environments, and has achieved reliable gains in image quality. Only recently has exposure correction work begun to address both underexposure and overexposure, and these efforts have reached state-of-the-art performance. However, they often produce images that are inconsistent with the ground truth and sometimes show obvious retouching artifacts.
    To address this limitation, this thesis proposes a new neural network architecture for exposure correction that is robust to a wide range of lighting conditions. The proposed model targets both underexposed and overexposed images, introducing a feature-constraint loss and a lightweight image feature enhancement module based on depth-wise separable convolution, so that the network can learn exposure variations in the feature space and keep image exposure consistent.
    Extensive experiments show that, on several public datasets, the proposed method achieves better results than existing methods on images rendered with incorrect exposure values and yields significant improvements on underexposed and overexposed images; the model is also implemented on an FPGA using Vitis AI.


    Capturing photographs with improper exposure remains a major source of defects in camera imaging. Exposure problems fall into two categories: underexposure, where the exposure time is too short and the image is too dark to interpret easily, and overexposure, where the exposure time is too long and parts of the image are washed out or overly bright. These extreme lighting conditions cause problems for both machine and human vision, and both greatly reduce the contrast and visual appeal of an image.
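    As a rough, self-contained illustration of what an exposure error means numerically (a generic sketch, not the data pipeline used in this thesis), an exposure shift of ev stops on a linear-intensity image can be approximated by multiplying by 2**ev and clipping: negative ev darkens the image (underexposure), while positive ev pushes highlights into saturation (overexposure).

        import numpy as np

        def simulate_exposure_error(img, ev):
            """Scale a linear-intensity image (values in [0, 1]) by 2**ev stops.

            ev < 0 mimics underexposure (darker image); ev > 0 mimics
            overexposure (highlights clipped to white).
            """
            return np.clip(img * (2.0 ** ev), 0.0, 1.0)

        mid_gray = np.array([0.5])
        print(simulate_exposure_error(mid_gray, -2.0))  # [0.125] -> much darker
        print(simulate_exposure_error(mid_gray, +2.0))  # [1.0]   -> clipped, washed out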
    Much of the previous work has focused on underexposed images, typically captured in low-light environments, and has achieved solid results in improving image quality. Only recently has exposure correction work begun to address overexposure as well as underexposure, and these efforts have reached state-of-the-art performance. However, such methods tend to produce images that are inconsistent with the ground truth, sometimes with visible retouching artifacts.
    To address this limitation, this work proposes a new neural network architecture for exposure correction that is robust to a wide range of lighting conditions. The proposed model handles both underexposed and overexposed images: it introduces a feature-constraint loss and a lightweight image feature enhancement module built on depth-wise separable convolution, which enable the network to learn exposure variations in feature space and keep the corrected exposure consistent.
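    To make the depth-wise separable convolution mentioned above concrete, the following is a minimal PyTorch sketch of the generic building block (a per-channel depth-wise convolution followed by a 1x1 point-wise convolution); it illustrates the standard technique only and is not the exact feature enhancement module proposed in the thesis.

        import torch
        import torch.nn as nn

        class DepthwiseSeparableConv(nn.Module):
            """Depth-wise 3x3 convolution (groups = in_channels) followed by a
            1x1 point-wise convolution that mixes channels. This factorization
            uses far fewer parameters and multiply-accumulates than a standard
            3x3 convolution, which is what makes such modules lightweight."""

            def __init__(self, in_channels, out_channels, kernel_size=3, stride=1):
                super().__init__()
                self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                           stride=stride, padding=kernel_size // 2,
                                           groups=in_channels)
                self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

            def forward(self, x):
                return self.pointwise(self.depthwise(x))

        x = torch.randn(1, 64, 128, 128)        # a 64-channel feature map
        y = DepthwiseSeparableConv(64, 128)(x)
        print(y.shape)                          # torch.Size([1, 128, 128, 128])

    For a 3x3 kernel this reduces the weight count from 9 * C_in * C_out (standard convolution) to 9 * C_in + C_in * C_out, which is the kind of saving that keeps a feature enhancement module lightweight.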
    Extensive experiments show that, on several public datasets, the proposed method achieves better results than existing methods on images rendered with incorrect exposure values and produces significant improvements on images suffering from underexposure or overexposure. The model is also deployed on an FPGA using Vitis AI.
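    The FPGA deployment referred to above typically follows the Vitis AI flow: the trained floating-point network is quantized to INT8 and then compiled into an xmodel for the board's DPU. The sketch below shows the general shape of the vai_q_pytorch calibration and export steps, assuming a PyTorch model; the stand-in network, input size, and calibration data are placeholders, and the exact API and the subsequent vai_c_xir compilation step should be checked against the Vitis AI version actually used.

        import torch
        import torch.nn as nn
        from pytorch_nndct.apis import torch_quantizer   # vai_q_pytorch (Vitis AI)

        # Stand-in for the trained exposure-correction network; in practice the
        # real trained weights would be loaded here instead.
        model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                              nn.Conv2d(16, 3, 3, padding=1)).eval()
        dummy_input = torch.randn(1, 3, 512, 512)
        # Placeholder calibration batches; real calibration uses representative images.
        calib_batches = [torch.randn(1, 3, 512, 512) for _ in range(8)]

        # 1) Calibration: run forward passes to collect activation statistics
        #    for INT8 quantization, then export the quantization config.
        quantizer = torch_quantizer("calib", model, (dummy_input,))
        quant_model = quantizer.quant_model
        for batch in calib_batches:
            quant_model(batch)
        quantizer.export_quant_config()

        # 2) Test/export: run the quantized model once and export the xmodel,
        #    which is then compiled for the target DPU (e.g. on the ZCU104).
        quantizer = torch_quantizer("test", model, (dummy_input,))
        quantizer.quant_model(dummy_input)
        quantizer.export_xmodel()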

    Acknowledgements I
    Abstract (Chinese) II
    Abstract (English) III
    Table of Contents IV
    List of Figures VI
    List of Tables VIII
    Chapter 1 Introduction 1
      1.1 Foreword 1
      1.2 Literature Review 1
      1.3 Research Motivation 5
      1.4 Outline 5
    Chapter 2 Deep Learning and Attention Mechanism Algorithms 7
      2.1 Convolutional Neural Networks 7
      2.2 Attention Mechanisms 13
      2.3 Transformers 18
        2.3.1 Self-Attention 20
        2.3.2 Multi-Head Attention 21
      2.4 Evaluation Metrics 23
    Chapter 3 Implementation of the Exposure Correction Deep Learning Model 33
      3.1 Deep Learning Frameworks 33
      3.2 Neural Network Model Components 39
        3.2.1 Activation Functions 40
        3.2.2 Optimizers 44
        3.2.3 Loss Functions 50
      3.3 Exposure Correction Model Based on Retinex Theory 55
    Chapter 4 FPGA Implementation 62
      4.1 Xilinx ZCU104 Development Board Hardware Configuration 62
      4.2 Zynq System Design and Execution Flow 65
      4.3 Vitis AI 67
      4.4 FPGA Implementation 68
    Chapter 5 Experimental Results and Analysis 72
      5.1 Experimental Environment and Equipment 72
      5.2 Training and Testing Datasets 73
      5.3 Comparison of Experimental Results 74
    Chapter 6 Conclusions and Future Work 87
      6.1 Conclusions 87
      6.2 Future Work 87
    References 89


    Full-text release date: 2024/08/08 (off-campus network)
    Full-text release date: 2024/08/08 (National Central Library: Taiwan doctoral and master's theses system)