| Student: | 闕居緯 Ju-Wei Que |
|---|---|
| Thesis title: | 物聯網邊緣攝影機之動態影像去模糊技術及聯合決策 Dynamic Deblurring and Joint-Decision for IoT-enabled Edge Cameras |
| Advisor: | 陸敬互 Ching-Hu Lu |
| Committee members: | 蘇順豐 Shun-Feng Su, 鍾聖倫 Sheng-Luen Chung, 花凱龍 Kai-Lung Hua, 黃正民 Cheng-Ming Huang, 陸敬互 Ching-Hu Lu |
| Degree: | 碩士 Master |
| Department: | 電資學院 - 電機工程系 Department of Electrical Engineering |
| Year of publication: | 2021 |
| Academic year of graduation: | 109 |
| Language: | Chinese |
| Number of pages: | 117 |
| Keywords (Chinese): | 人工智慧、深度學習、機器學習、影像去模糊、影像處理、物聯網、邊緣運算 |
| Keywords (English): | artificial intelligence, deep learning, machine learning, image deblurring, image processing, internet of things, edge computing |
Blurred images often severely affect the stability of computer-vision services. With the development of the Internet of Things (IoT), cameras that combine artificial intelligence with edge computing (hereafter, edge cameras) can improve the robustness of image-detection-based IoT services. Although recent studies have adopted deep neural networks for image deblurring, their models are too large to run effectively on resource-constrained edge cameras. This study therefore proposes a lightweight network model that can run efficiently on an edge camera. To improve its learning effectiveness, we propose a "separate spatial attention inverted residual block", which allows each channel of a feature map to extract independent spatial information, and a "border perceptual loss", which teaches the deep network how to effectively restore the edge details of objects. Building on this lightweight network, we further propose an "alternately-embedded convolutional LSTM" to substantially improve the quality of the generated images. Experimental results show that, among models with a direct-mapping design, our method improves the structural similarity index (SSIM) by 0.77%, the peak signal-to-noise ratio (PSNR) by 2.01%, and the running speed by 33.78% compared with existing studies; among models with a generative-adversarial design, it improves SSIM by 5.58%, PSNR by 11.12%, and running speed by 19.52%. These results show that the proposed method has a clear advantage in both image quality and execution speed. Furthermore, existing deblurring studies do not consider the quality of the input image, so sharp images may be needlessly processed and thereby degraded. To improve image quality and make effective use of an edge camera's computing resources, this study proposes a "dynamic image-deblurring detection model" that evaluates the quality of an image to decide whether deblurring is needed. Experimental results show that, after integrating this detection model with the aforementioned lightweight deblurring network, the overall SSIM improves by 0.21% and the running speed increases by 294.5%, demonstrating that dynamic image-quality assessment can greatly improve runtime efficiency while still improving the quality of the generated images. Finally, this study demonstrates the benefits of the proposed techniques through a "multi-camera deblurring joint decision", which integrates the recognition results of multiple edge cameras to effectively reduce the probability of recognition errors and thereby improve the quality of IoT services.
Keywords: image deblurring, lightweight deep network, dynamic image deblurring, edge computing, Internet of Things, joint decision
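To make the dynamic deblurring idea above concrete, the following is a minimal PyTorch sketch of how a small quality-assessment network could gate the lightweight deblurring model on an edge camera. The class and function names (`BlurDetector`, `dynamic_deblur`) and the threshold value are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn as nn

class BlurDetector(nn.Module):
    """Hypothetical stand-in for the quality-assessment model:
    outputs a scalar blur score in [0, 1] for an input image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)

    def forward(self, x):
        f = self.features(x).flatten(1)          # (N, 32)
        return torch.sigmoid(self.head(f))        # (N, 1), higher = blurrier

def dynamic_deblur(frame, detector, deblur_net, threshold=0.5):
    """Run the (separately trained) deblurring network only when the
    detector judges the frame as blurry, so sharp frames are passed
    through untouched and edge-camera compute is saved."""
    with torch.no_grad():
        blur_score = detector(frame).item()       # assumes a single-image batch
        if blur_score < threshold:                 # already sharp: skip deblurring
            return frame
        return deblur_net(frame)                   # blurry: restore it

# Usage (hypothetical): sharp = dynamic_deblur(img, BlurDetector().eval(), deblur_model)
```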
Blurred images often significantly degrade the stability of a computer-vision system. With the development of the Internet of Things (IoT), a camera leveraging artificial intelligence and edge computing (hereafter referred to as an edge camera) can enhance the robustness of an IoT service. Although there have been studies using deep neural networks (DNNs) for image deblurring, these models often cannot work effectively on an edge camera due to its limited resources. Therefore, our study proposes a lightweight DNN model that runs efficiently on edge cameras. The model adopts a "separate spatial attention inverted residual block (SSA-IRB)", which enhances training effectiveness by letting each channel of a feature map extract independent spatial information, and a "border perceptual loss", which helps the network effectively recover the edge details of objects during training. We also propose an "alternately-embedded convolutional LSTM" to further improve image quality. Experimental results show that, among models with a direct-mapping design, SSIM improved by 0.77%, PSNR by 2.01%, and speed by 33.78% compared with existing studies; among models with a GAN-based design, SSIM improved by 5.58%, PSNR by 11.12%, and speed by 19.52%. In addition, existing deblurring studies did not consider the quality of the input image, so a sharp input image may be degraded by unnecessary deblurring. To address this issue, we propose "dynamic deblurring", which first assesses the quality of an input image to determine whether deblurring should be undertaken. The results show that SSIM improved by 0.21% and speed improved by 294.5%. Finally, we demonstrate the benefits of the proposed approaches through a "multi-camera deblurring joint decision", which fuses the recognition results of multiple edge cameras to reduce recognition errors and thereby improve the quality of IoT services.
Keywords: image deblurring, lightweight neural network, dynamic deblurring, edge computing, Internet of Things, joint decision
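As a concrete illustration of the multi-camera joint decision mentioned in both abstracts, the sketch below fuses per-camera recognition results with a simple majority vote. The function name, voting rule, and example values are assumptions for illustration only and may differ from the fusion strategy actually used in the thesis.

```python
from collections import Counter

def joint_decision(camera_predictions):
    """Combine per-camera recognition results by majority vote.

    camera_predictions: list of (label, confidence) pairs, one per edge camera.
    Returns the winning label and the mean confidence of the cameras that
    voted for it. This is a simple illustrative fusion rule.
    """
    votes = Counter(label for label, _ in camera_predictions)
    label, _count = votes.most_common(1)[0]
    confs = [c for l, c in camera_predictions if l == label]
    return label, sum(confs) / len(confs)

# Example: three edge cameras observe the same scene; one misrecognizes the
# object because of motion blur, but the fused decision is still correct.
preds = [("person", 0.91), ("person", 0.88), ("bicycle", 0.40)]
print(joint_decision(preds))   # ('person', 0.895)
```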