
Graduate Student: De-Qin Gao (高得欽)
Thesis Title: An Effective Dynamic Convolutional Neural Network for Surveillance Video
Advisor: Shanq-Jang Ruan (阮聖彰)
Committee Members: Shanq-Jang Ruan (阮聖彰), Yuan-Hsiang Lin (林淵翔)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Publication Year: 2018
Academic Year of Graduation: 106 (ROC calendar)
Language: Chinese
Number of Pages: 60
Chinese Keywords: reducing convolutional network computation, optimizing CNNs, convolutional networks for surveillance video, dynamic convolution, optimizing convolutional network architecture
Foreign Keywords: Reduce convolutional network operations, Optimize CNN, convolutional networks for monitoring images
Metrics: 209 views, 6 downloads
  • In recent years, research on reducing the storage space and computational resources of neural networks has grown steadily, and these studies have made great progress from theory to platform implementation. Exploiting the fact that most scenes in surveillance video are highly similar from frame to frame, this thesis proposes a method for optimizing convolutional neural networks, called dynamic convolution. Compared with existing techniques, this method requires neither retraining the weights nor analyzing their importance; it can be applied directly to existing convolutional neural network architectures, effectively reduces the amount of convolution computation, and incurs low latency. Applied to object detection (Single Shot MultiBox Detector [21]) and tested on fourteen surveillance videos of different scenes, the method reduces floating-point operations (FLOPs) by an average of 39% per image, while the impact on accuracy is less than 0.7% mean average precision (mAP). Moreover, the dynamic convolution proposed in this thesis optimizes the convolution process itself, which is a different layer of the problem from existing acceleration methods; it can therefore complement existing acceleration techniques to achieve further speedup.


    Recently, the number of neural network studies on reducing storage space and computational resources has been increasing, and these studies have made great progress from theory to practice. This paper focuses on the high similarity among scenes of a surveillance video and presents a method to optimize convolutional neural networks, called dynamic convolution. Compared with existing techniques, this method can be applied directly to existing convolutional neural network architectures without retraining or analyzing weights, and it effectively reduces the computation of convolution. The experiments test 14 surveillance videos with various scenes. This paper shows that the proposed method can reduce the inference cost of object detection (Single Shot MultiBox Detector [21]) by up to 39% of FLOPs, while the effect on accuracy is lower than 0.7% mAP. The dynamic convolution proposed in this paper optimizes a specific aspect of the convolution process, which differs from existing acceleration methods; therefore, the proposed method can complement existing acceleration methods to further speed up performance.
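The abstract describes the core idea: because consecutive surveillance frames are highly similar, convolution outputs whose receptive fields did not change can be reused rather than recomputed. The following is a minimal single-channel NumPy sketch of that idea, not the thesis's actual implementation; the function names and the difference threshold are illustrative assumptions.

```python
import numpy as np

def conv2d(x, k):
    """Naive valid 2-D convolution (reference implementation)."""
    kh, kw = k.shape
    h, w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
    return out

def dynamic_conv2d(frame, prev_frame, prev_out, k, thresh=0.05):
    """Recompute convolution only where the input changed.

    `thresh` is a hypothetical frame-difference threshold; output
    positions whose receptive field is unchanged reuse `prev_out`.
    Returns the output and the number of recomputed positions.
    """
    kh, kw = k.shape
    diff = np.abs(frame - prev_frame) > thresh   # input difference map
    out = prev_out.copy()
    recomputed = 0
    for i in range(prev_out.shape[0]):
        for j in range(prev_out.shape[1]):
            if diff[i:i+kh, j:j+kw].any():       # receptive field changed?
                out[i, j] = np.sum(frame[i:i+kh, j:j+kw] * k)
                recomputed += 1
    return out, recomputed
```

In a full network this decision would be made per layer (the thesis's input difference map and inner difference map); the multiply-accumulates skipped at unchanged positions are, roughly, the source of the reported FLOP savings.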

    Table of Contents
    Chapter 1 Introduction 1
    1.1 Preface 1
    1.2 Motivation 1
    1.3 Objectives 2
    1.4 Thesis Organization 3
    Chapter 2 Related Work 4
    2.1 Efficient Model Design 4
    2.2 Weight Pruning Methods 9
    2.3 Weight Quantization 10
    Chapter 3 Design and Overview of Dynamic Convolution 12
    3.1.1 Neural Networks 12
    3.1.2 Convolutional Neural Networks 13
    3.2 Dynamic Convolutional Neural Network 15
    3.2.1 Input Difference Map 18
    3.2.2 Inner Difference Map 19
    3.2.3 Dynamic Convolution 21
    Chapter 4 Experimental Results 27
    4.1 Test Environment 27
    4.2 Dynamic Convolution Model Diagram 28
    4.3 Frame-Difference Threshold 30
    4.4 Accuracy Evaluation of Dynamic Convolution 33
    4.4.1 Accuracy Evaluation on Static Images 33
    4.4.2 Accuracy Evaluation on Surveillance Video 35
    4.5 Computation Evaluation 36
    Chapter 5 Conclusion and Discussion 39
    References 40
    Appendix 44

    [1] British Security Industry Association (BSIA). BSIA Survey; London, UK, 2013.
    [2] O. Russakovsky, J. Deng, H. Su and J. Krause, “Imagenet large scale visual recognition challenge,” Int. J. Comput. Vis., vol. 115, no. 3, pp. 211-252, Dec. 2015.
    [3] H. Li, A. Kadav, H. Samet and H. P. Graf. “Pruning Filters for efficient convnets,” arXiv preprint arXiv:1608.08710, Aug. 2016.
    [4] Han, Song, Jeff Pool, John Tran, and William Dally. “Learning both Weights and Connections for Efficient Neural Network,” in Proc. 28th ACM Int. Conf. Neural Information Processing Systems, NIPS ’15, Montreal, Canada, Jun. 2015, pp. 1135–1143.
    [5] Han, Song, Huizi Mao, and William J Dally. “Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding,” arXiv preprint arXiv:1510.00149, Oct. 2015.
    [6] S. Ioffe and C. Szegedy. “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” arXiv preprint arXiv:1502.03167, Feb. 2015.
    [7] F. N. Iandola, M. W. Moskewicz and K. Keutzer. “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size,” arXiv preprint arXiv:1602.07360, Feb. 2016.
    [8] A. Howard, M. Zhu, M. Andreetto, and H. Adam. “Mobilenets: Efficient convolutional neural networks for mobile vision applications,” arXiv preprint arXiv:1704.04861, Apr 2017
    [9] D. S. Suresh, and M. P. Lavanya, “Motion Detection and Tracking using Background Subtraction and Consecutive Frames Difference Method,” International Journal of Research Studies in Science Engineering and Technology, vol. 1, no. 5, pp. 16-22, Aug. 2014.
    [10] D. Soudry, I. Hubara, and R. Meir. “Expectation backpropagation: Parameter-free training of multilayer neural networks with continuous or discrete weights,” In Advances in Neural Information Processing Systems(NIPS), vol. 1, pp. 963–971, Dec. 2014.
    [11] J. Wu, C. Leng, Y. Wang, Q. Hu, and J. Cheng. “Quantized convolutional neural networks for mobile devices,” In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4820–4828, Dec. 2016.
    [12] M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi. “XNOR-Net: ImageNet classification using binary convolutional neural networks,” in European Conference on Computer Vision; arXiv preprint arXiv:1603.05279, Mar. 2016.
    [13] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna. “Rethinking the inception architecture for computer vision,” arXiv preprint arXiv:1512.00567, Mar. 2016.
    [14] X. Zhang, X. Zhou, L. Mengxiao, and J. Sun. “Shufflenet: An extremely efficient convolutional neural network for mobile devices,” arXiv preprint arXiv:1707.01083, Jul. 2017.
    [15] Y.-D. Kim, E. Park, S. Yoo, T. Choi, L. Yang, and D. Shin. “Compression of deep convolutional neural networks for fast and low power mobile applications,” arXiv preprint arXiv:1511.06530, Nov. 2015.
    [16] Z. Liu, J. Li, Z. Shen, G. Huang, S. Yan, and C. Zhang. “Learning efficient convolutional networks through network slimming,” in The IEEE International Conference on Computer Vision (ICCV), Oct. 2017.
    [17] J. Yang, Y.-H. Chen, and V. Sze, “Designing energy-efficient convolutional neural networks using energy-aware pruning,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recognit, 2017.
    [18] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” arXiv preprint arXiv:1409.4842, Sep. 2014.
    [19] Han, Song, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark A Horowitz, and William J Dally. “Eie: Efficient inference engine on compressed deep neural network,” International Symposium on Computer Architecture (ISCA), 2016.
    [20] Chen, Y., Luo, T., Liu, S., Zhang, S., He, L., Wang, J., Li, L., Chen, T., Xu, Z., Sun, N., et al. “Dadiannao: A machine-learning supercomputer,” in 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 609–622, Dec. 2014.
    [21] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, and S. Reed. “SSD: Single Shot MultiBox Detector,” arXiv preprint arXiv:1512.02325, Dec. 2016.
    [22] F. N. Iandola, M. W. Moskewicz, K. Ashraf, S. Han, W. J. Dally, and K. Keutzer. “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size,” arXiv preprint arXiv:1602.07360, 2016.
    [23] M. Courbariaux, Y. Bengio, and J.-P. David. “Training deep neural networks with low precision multiplications,” arXiv preprint arXiv:1412.7024, Dec. 2014.
    [24] Sajid Anwar, Kyuyeon Hwang, and Wonyong Sung. “Structured Pruning of Deep Convolutional Neural Networks,” arXiv preprint arXiv:1512.08571, 2015.
    [25] Adam Polyak and Lior Wolf. “Channel-Level Acceleration of Deep Face Representations,” in IEEE Access, vol.3, pp. 2163 – 2175, Oct. 2015.
    [26] M. Horowitz. Energy table for 45nm process, Stanford VLSI wiki. [Online]. Available: https://sites.google.com/site/seecproject
    [27] K. Fukushima. “Neocognitron: A Self-Organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position,” Biological Cybernetics, vol. 36, pp. 193-202, 1980.
    [28] SSD source code. [Online]. Available: https://github.com/weiliu89/caffe/tree/ssd
    [29] i-LIDS Dataset for AVSS 2007. [Online]. Available: ftp://motinas.elec.qmul.ac.uk/pub/iLids
