
Graduate Student: 王奕升 (Yi-Sheng Wang)
Thesis Title: Compressing Convolutional Neural Networks Using Eigenfilters Analysis (利用特徵濾波器分析的卷積神經網路壓縮方法)
Advisor: 王乃堅 (Nai-Jian Wang)
Committee Members: 郭景明 (Jing-Ming Guo), 呂學坤 (Shyue-Kung Lu), 鍾順平 (Shun-Ping Chung), 曾德峰 (Der-Feng Tseng)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2022
Graduation Academic Year: 110
Language: English
Number of Pages: 57
Keywords: Convolutional Neural Network, Principal Component Analysis, Filter Pruning
Access Counts: Views: 208, Downloads: 1
With the continual progress of deep learning, convolutional neural network architectures in computer vision now consume more memory and demand more complex computation than ever before, which in turn places higher requirements on hardware. As a result, deploying deep learning applications on mobile devices has become increasingly difficult year by year, and achieving comparable recognition accuracy under limited memory and hardware performance has become one of the problems attracting attention in recent years.

Therefore, this thesis proposes a novel compression method for convolutional neural networks. The method first applies principal component analysis to decompose the filters in a convolution layer, then uses the decomposition result to design a convolution structure that reconstructs the filters and processes the input images. Finally, pruning is applied to compress the whole network, and after compression the model is restored to its pre-compression recognition accuracy through fine-tuning.

Furthermore, experiments show that, compared with state-of-the-art pruning methods, the proposed compression method brings significant improvements in overall model performance. Taking ResNet-34 as an example, the proposed method achieves a 6.9× compression ratio for the whole model on the CIFAR-10 dataset with only a 0.7% loss in recognition accuracy.


Convolutional Neural Networks (CNNs) have become the dominant solution for a variety of computer vision problems. However, their high demand for computing power and memory storage makes them difficult to deploy on portable devices with limited hardware resources.

In this thesis, a novel model compression method is proposed. For each convolution layer in a pre-trained CNN model, our method first applies Principal Component Analysis (PCA) to decompose the learned filter set of the layer into a linear combination of basis filters and their corresponding coordinates. Next, we replace the original layer with a filter reconstruction layer built from the two filter sets given by this linear combination. After these two steps, we prune filters in the filter reconstruction layer and retrain the network to fine-tune the remaining filters.
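The decomposition step described above can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the layer shapes, the number of retained basis filters, and the random stand-in weights are all assumptions made for the example.

```python
import numpy as np

# Hypothetical convolution layer: N=64 filters, C=16 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
N, C, k = 64, 16, 3
W = rng.standard_normal((N, C, k, k))   # stand-in for the learned filters

# Flatten each filter into a row vector and run PCA via SVD of the centered data.
F = W.reshape(N, -1)                    # (N, C*k*k)
mean = F.mean(axis=0)
U, S, Vt = np.linalg.svd(F - mean, full_matrices=False)

m = 16                                  # number of basis filters kept (assumed)
basis = Vt[:m]                          # (m, C*k*k): basis filters
coords = (F - mean) @ basis.T           # (N, m): coordinates of each filter

# Each original filter is approximated by the mean filter plus a linear
# combination of the m basis filters.
F_hat = coords @ basis + mean
rel_err = np.linalg.norm(F - F_hat) / np.linalg.norm(F)
```

Because convolution is linear, the original layer is then equivalent (up to the truncation error `rel_err`) to convolving the input with the m basis filters and then applying a 1×1 convolution whose weights are the coordinates; pruning operates on the basis filters of this reconstruction layer.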

Furthermore, we experimentally show that the proposed method leads to significant improvements over state-of-the-art pruning methods. For the ResNet-34 model, our method achieves a whole-model compression ratio of 6.9× with only a 0.7 percent drop in accuracy on the CIFAR-10 dataset.

Abstract (in Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
1 Introduction
1.1 Background
1.2 Motivation
1.3 Contributions
1.4 Thesis Organization
2 Related Works
2.1 Convolutional Neural Network
2.2 Model Compression and Acceleration
2.2.1 Parameter Pruning
2.2.2 Low-Rank Factorization
2.2.3 Quantization and Binarization
2.2.4 Knowledge Distillation
3 Proposed Method
3.1 Dimensionality Reduction of Convolution Layer
3.1.1 Principal Component Analysis
3.1.2 PCA for Convolution Layer
3.2 Layer Design of Filter Reconstruction
3.3 Filter Pruning for Reconstruction Layer
3.3.1 Overall Strategy
3.3.2 Filter Scoring
3.3.3 Filter Pruning
3.3.4 Pruning Sensitivity
3.3.5 Fine-Tuning
4 Experiments
4.1 Experimental Setting
4.1.1 Environment
4.1.2 Datasets
4.1.3 Architectures
4.1.4 Fine-Tuning Setup
4.2 Experimental Results
4.2.1 LeNet on MNIST
4.2.2 VGGNet on CIFAR-10
4.2.3 ResNet on CIFAR-10
4.2.4 ResNet on CIFAR-100
5 Conclusions and Future Works
5.1 Conclusions
5.2 Future Works
References

