
Student: Bo-En Shao (邵伯恩)
Thesis Title: Multi-scene Image Enhancement for IoT-enabled Edge Cameras (物聯網邊緣攝影機之多場景影像強化技術)
Advisor: Ching-Hu Lu (陸敬互)
Committee Members: Kai-Lung Hua (花凱龍), Sheng-Luen Chung (鍾聖倫), Shun-Feng Su (蘇順豐), Cheng-Min Huang (黃正民), Ching-Hu Lu (陸敬互)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2019
Academic Year of Graduation: 107 (2018-2019)
Language: Chinese
Number of Pages: 94
Keywords (Chinese): 多場景影像強化、輕量化神經網路、環境感知模型佈署、邊緣運算、物聯網
Keywords (English): multi-scene image enhancement, lightweight neural network, environment-aware model deployment, edge computing, Internet of Things
    With the development of the Internet of Things (IoT), cameras that can exploit the benefits of edge computing (referred to in this study as edge cameras), combined with artificial intelligence, can now provide value-added IoT services. However, edge cameras operate in highly variable environments, and the images they capture may be degraded by weather, lighting, and noise. Beyond harming visual quality, such interference also reduces the quality and stability of the services provided by computer vision systems. To keep the input image quality of edge cameras stable in changing environments, recent studies have proposed deep neural networks for improving image quality; however, previous training techniques still leave room for improvement in the diversity of loss evaluation and the stability of training. This study therefore proposes an image enhancement training framework based on relativistic adversarial networks, which combines statistical losses, perceptual losses, and a stable relativistic adversarial training design, so that different kinds of loss functions complement one another to improve the training of image enhancement networks, and the loss composition can be tuned to optimize the visual quality of enhancement for different scenes. Second, because the hardware of existing edge cameras is still very limited and can hardly run typical, complex image enhancement neural networks smoothly, this study proposes a lightweight residual enhancement network designed specifically for edge cameras. Built on efficient separable convolution operations, it reduces the computational complexity of the network while preserving enhancement effectiveness as much as possible; moreover, its internal structure is modular, so the network's execution efficiency can be optimized for different hardware specifications. Finally, because the environment around an edge camera changes over time and its image enhancement needs change with it, this study proposes an environment-aware model deployment module that judges the current environmental condition from the images captured by the edge camera; when the environment changes and the current enhancement model is no longer suitable, the module redeploys an enhancement model matched to the new condition, keeping the edge camera's image quality stable. Experimental results show that enhancement networks trained with the composite loss of the proposed relativistic adversarial training framework produce enhanced images with clearly improved texture and structural sharpness, and that, at an image quality level comparable to previous work, the proposed lightweight residual enhancement network runs more than 30% faster. Using the environment-aware model deployment module to schedule enhancement models on streaming data simulating a changing environment, the average PSNR of input images after a weather change rose from 17.78 to 28.21 and the average SSIM from 0.798 to 0.932, greatly improving the quality stability of input images in dynamic environments.


    With the development of the Internet of Things (IoT), it is feasible to incorporate artificial intelligence into an IoT-enabled camera (one with the ability to leverage edge intelligence, hereafter referred to as an edge camera) to provide various value-added services. However, environmental interference (e.g., a rainstorm) can significantly degrade the visual quality of the images fed into an edge camera, thus affecting the quality of the computer vision services built on them. To provide robust IoT services, the quality of the images entering computer vision systems needs to be kept stable. Deep neural networks have proven effective for image enhancement; however, the training methodologies proposed in previous studies often have stability issues and suffer from inflexible loss composition. To address these issues, we propose a training framework based on the relativistic average generative adversarial network (RaGAN); the framework incorporates a composite loss, including statistical and perceptual losses, assisted by a stable adversarial training design. In our framework, the three kinds of losses complement each other, further enhancing image quality under different weather scenarios. Next, today's edge cameras often have restricted computing power and cannot run complex neural networks efficiently. To solve this issue, we design a lightweight residual enhancement network that uses highly efficient separable convolution operations as its basic building blocks. With its modular design, the network can easily be optimized for different hardware specifications. Finally, because an edge camera is deployed in a dynamic real-life environment, the weather-related interference it encounters can change over time, so the type of image enhancement applied should also change accordingly. To accommodate various weather conditions, we design an environment-aware model deployer, which detects the current weather condition and deploys the most suitable image enhancement model to counteract undesirable interference and maintain image quality as much as possible. The experimental results show that models trained with the proposed composite loss in our relativistic training framework better enhance image textures with more details. Compared with previous research, the lightweight residual enhancement network reduces the execution time of the model by over 30% while achieving similar enhancement effectiveness. Finally, on streaming image data that simulates changing weather conditions, the proposed environment-aware model deployer increased the average PSNR from 17.78 to 28.21 and the average SSIM from 0.798 to 0.932; thus the quality of the images entering an edge camera under changing weather can be effectively maintained.
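    To make the training objective concrete: the framework couples a statistical (pixel-wise) loss, a perceptual loss, and a relativistic average adversarial loss. The following PyTorch sketch shows the standard relativistic average GAN objective (Jolicoeur-Martineau, 2018) and one plausible way to combine the three terms; the loss weights and the vgg_features helper are illustrative assumptions, not the thesis's actual configuration.

    import torch
    import torch.nn.functional as F

    def ragan_d_loss(real_logits, fake_logits):
        # Relativistic average discriminator: real samples should score higher
        # than the *average* fake sample, and fakes lower than the average real.
        real_rel = real_logits - fake_logits.mean()
        fake_rel = fake_logits - real_logits.mean()
        return (F.binary_cross_entropy_with_logits(real_rel, torch.ones_like(real_rel))
                + F.binary_cross_entropy_with_logits(fake_rel, torch.zeros_like(fake_rel)))

    def ragan_g_loss(real_logits, fake_logits):
        # Symmetric objective for the generator: its outputs should look more
        # real than the average real sample, which stabilizes training.
        real_rel = real_logits - fake_logits.mean()
        fake_rel = fake_logits - real_logits.mean()
        return (F.binary_cross_entropy_with_logits(real_rel, torch.zeros_like(real_rel))
                + F.binary_cross_entropy_with_logits(fake_rel, torch.ones_like(fake_rel)))

    def composite_loss(enhanced, target, real_logits, fake_logits, vgg_features,
                       w_stat=1.0, w_perc=0.1, w_adv=5e-3):
        # Composite generator loss: statistical (pixel-wise L1) + perceptual
        # (distance in a pretrained VGG feature space) + relativistic adversarial.
        # Weights here are illustrative placeholders, not the thesis's values.
        stat = F.l1_loss(enhanced, target)
        perc = F.l1_loss(vgg_features(enhanced), vgg_features(target))
        return w_stat * stat + w_perc * perc + w_adv * ragan_g_loss(real_logits, fake_logits)

    Tuning the relative weights per weather scenario is what lets the loss composition be adapted to different scenes, as the abstract describes.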
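    The lightweight residual enhancement network is built from efficient separable convolutions with a modular, hardware-tunable structure; the table of contents below also names inverted residual blocks and width/block multipliers. A minimal PyTorch sketch of those building blocks, assuming a MobileNetV2-style design (all layer sizes and names here are hypothetical):

    import torch.nn as nn

    def separable_conv(in_ch, out_ch, stride=1):
        # Depthwise separable convolution: a per-channel 3x3 depthwise filter
        # followed by a 1x1 pointwise projection; far fewer multiply-adds than
        # a dense 3x3 convolution with the same channel counts.
        return nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, stride, padding=1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch),
            nn.ReLU6(inplace=True),
            nn.Conv2d(in_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    class InvertedResidual(nn.Module):
        # Inverted residual block: expand -> depthwise -> project, with a skip
        # connection; it composes the separable operation shown above with a
        # channel expansion, as in MobileNetV2.
        def __init__(self, channels, expansion=6):
            super().__init__()
            hidden = channels * expansion
            self.block = nn.Sequential(
                nn.Conv2d(channels, hidden, 1, bias=False),
                nn.BatchNorm2d(hidden),
                nn.ReLU6(inplace=True),
                nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
                nn.BatchNorm2d(hidden),
                nn.ReLU6(inplace=True),
                nn.Conv2d(hidden, channels, 1, bias=False),
                nn.BatchNorm2d(channels),
            )

        def forward(self, x):
            return x + self.block(x)

    def build_backbone(base_channels=32, num_blocks=4, width_mult=1.0):
        # Width and block multipliers scale the network to a hardware budget,
        # mirroring the modular design in the abstract (values illustrative).
        ch = max(8, int(base_channels * width_mult))
        layers = [nn.Conv2d(3, ch, 3, padding=1)]
        layers += [InvertedResidual(ch) for _ in range(num_blocks)]
        layers += [nn.Conv2d(ch, 3, 3, padding=1)]
        return nn.Sequential(*layers)

    Shrinking width_mult or num_blocks trades enhancement quality for speed, which is one way a structure optimization module could match a given edge camera's compute budget.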
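    The environment-aware model deployer samples incoming frames, classifies the current condition, and swaps the enhancement model only when the condition drifts. A minimal sketch of that dispatch loop, assuming TorchScript model files and a user-supplied scene classifier (all class, file, and label names are hypothetical):

    import torch

    class EnvironmentAwareDeployer:
        # Sketch of the deployer: a lightweight scene classifier watches frames,
        # and when the detected condition differs from the one the active model
        # serves, the matching model is loaded from a local repository.
        def __init__(self, classifier, model_paths, device="cpu"):
            self.classifier = classifier      # frame -> label, e.g. "rain" / "haze" / "clear"
            self.model_paths = model_paths    # e.g. {"rain": "derain.pt", "haze": "dehaze.pt"}
            self.device = device
            self.current_scene = None
            self.current_model = None

        def _load(self, scene):
            # Assumes each enhancement model was exported as TorchScript.
            model = torch.jit.load(self.model_paths[scene], map_location=self.device)
            model.eval()
            return model

        def process(self, frame):
            scene = self.classifier(frame)
            # Redeploy only when the scene actually changes (concept drift).
            if scene != self.current_scene:
                self.current_scene = scene
                self.current_model = self._load(scene)
            with torch.no_grad():
                return self.current_model(frame)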
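    The reported gains (average PSNR 17.78 -> 28.21, average SSIM 0.798 -> 0.932 after a simulated weather change) can be measured in form with standard metric implementations; a sketch using scikit-image follows, though the thesis's exact evaluation protocol is not specified here.

    import numpy as np
    from skimage.metrics import peak_signal_noise_ratio, structural_similarity

    def stream_quality(enhanced_frames, reference_frames):
        # Average PSNR/SSIM over a stream of 8-bit RGB frames (numpy arrays).
        # channel_axis requires scikit-image >= 0.19; older releases used the
        # multichannel=True argument instead.
        psnr = np.mean([peak_signal_noise_ratio(ref, out, data_range=255)
                        for out, ref in zip(enhanced_frames, reference_frames)])
        ssim = np.mean([structural_similarity(ref, out, channel_axis=-1, data_range=255)
                        for out, ref in zip(enhanced_frames, reference_frames)])
        return psnr, ssim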

    Chinese Abstract
    English Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1: Introduction
      1.1 Research Motivation
      1.2 Literature Review
        1.2.1 Training Architectures for Image Enhancement Neural Networks
        1.2.2 Edge-Computing Acceleration of Deep Neural Networks
        1.2.3 Concept Drift Detection and Handling for Edge Input Images
      1.3 Contributions and Thesis Organization
    Chapter 2: System Design Rationale and Architecture Overview
    Chapter 3: Image Enhancement Training Framework Based on Relativistic Adversarial Networks
      3.1 Relativistic Adversarial Training Framework
      3.2 Statistical Loss Evaluation Module
      3.3 Perceptual Loss Evaluation Module
      3.4 Generative Adversarial Networks and Their Convergence Instability
      3.5 Relativistic Average Generative Adversarial Network
      3.6 Discriminator Structure and Normalization
      3.7 Training Framework Workflow and Implementation Details
    Chapter 4: Lightweight Residual Enhancement Network
      4.1 Separable Convolution Operations
      4.2 Inverted Residual Blocks
      4.3 Structural Design of the Lightweight Residual Enhancement Network
      4.4 Analysis of the Network's Computational Requirements
      4.5 Network Structure Optimization Module
    Chapter 5: Environment-Aware Model Deployment Module
      5.1 Module Overview
      5.2 Environment Analyzer
      5.3 Concept Drift Detector
      5.4 Model Manager and Repository
    Chapter 6: Experimental Results and Discussion
      6.1 Experimental Platform
      6.2 Datasets and Image Quality Metrics
      6.3 Loss Composition Experiments for the Training Framework
        6.3.1 Basic Loss Experiments
        6.3.2 Composite Loss Experiments
        6.3.3 Objective User Experience Evaluation
        6.3.4 Discussion
      6.4 Structural Experiments on the Lightweight Residual Enhancement Network
        6.4.1 Width Multiplier Experiments
        6.4.2 Block Multiplier Experiments
        6.4.3 Discussion
      6.5 Configuration Experiments for the Environment-Aware Model Deployment Module
        6.5.1 Training and Validation of the Environment Analyzer
        6.5.2 Sampling Frequency and Sensitivity Experiments
        6.5.4 Scene Classification Strategy Experiments
        6.5.5 Discussion
      6.6 Image Enhancement Application Demonstrations
      6.7 Comparison with Related Work
    Chapter 7: Conclusions and Future Research Directions
    References
    List of Publications and Works
    Committee Comments and Responses


    Full-text release date: 2024/08/13 (campus network)
    Full-text release date: 2024/08/13 (off-campus network)
    Full-text release date: 2024/08/13 (National Central Library: Taiwan NDLTD)