
Graduate Student: Sheng-An Huang (黃勝安)
Thesis Title: Piecewise Linearization with Dense Matrix Encoding for Neural Networks Parameters to Achieve Model Compression (片段線性化暨緊密矩陣編碼於神經網路參數實現模型壓縮)
Advisor: Jenq-Shiou Leu (呂政修)
Oral Examination Committee: Cheng-Fu Chou (周承復), Hsing-Lung Chen (陳省隆), Shanq-Jang Ruan (阮聖彰), Jenq-Shiou Leu (呂政修)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2021
Academic Year of Graduation: 109 (2020-2021)
Language: English
Number of Pages: 35
Chinese Keywords: 深度神經網路壓縮, 線性化, 近似函數, 緊密矩陣編碼
English Keywords: Deep neural network compression, linearization, approximating function, dense matrix encoding
Access Count: Views: 233; Downloads: 0
  • Abstract in Chinese (translated): Deep neural networks are widely used in many fields, such as image recognition, data analysis, and natural language processing. High-accuracy models often come with parameters on the order of millions, and constructing deeper neural networks to reach high accuracy has become the current research trend. On the other hand, deploying a neural network on resource-limited devices that support intelligent computing, such as embedded systems and smartphones, is a major challenge. To store fewer parameters on these devices, effective model compression is indispensable. In addition, an effective encoding algorithm for neural network parameters is a key factor in reducing storage complexity. Most previous studies focus on sparse matrix encoding rather than dense matrix encoding; dense matrix encoding has not yet been carefully explored, and sparse matrix encoding cannot represent a dense matrix efficiently. Therefore, this study proposes an alternative compression algorithm that linearizes the weights of each layer and then stores the key information of the linearized weights in hardware. Compared with sparse matrix encoding algorithms, this approach achieves a higher compression rate on dense matrices. Experimental results show that the piecewise linearization scheme achieves a compression rate of at least 2x on VGG-16, Resnet152, and Densenet169; in other words, only about 50% of the original storage space is needed to hold the neural network parameters.


    Abstract in English: Deep neural networks have been applied to many applications, such as image recognition, data analysis, and natural language processing. A high-accuracy model may contain millions of parameters, and constructing deeper models to reach high accuracy is the current research trend. On the other hand, deploying a neural network model on a resource-limited device, such as an embedded system or a smartphone, to support intelligent computing is a major challenge. In order to store fewer parameters on these devices, an efficient model compression scheme is highly desired. Moreover, an efficient encoding for compressed neural networks is also critical for reducing storage complexity. Most previous studies focus on sparse matrix encoding rather than dense matrix encoding; dense matrix encoding methods have not been studied well, and sparse matrix encoding cannot attain a high compression rate when applied to a dense matrix. Toward this end, we propose an alternative compression algorithm that linearizes the weights of each layer and then stores the linearized weights together with their critical information. For a dense matrix, our algorithm achieves a higher compression rate than existing sparse matrix encodings. Experimental results indicate that the proposed piecewise linearization scheme achieves a compression rate of at least two on VGG-16, Resnet152, and Densenet169. In other words, the neural network parameters can be stored in around 50% of the original space.
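
    The record does not include the thesis body, but the idea described in the abstract (replace each layer's raw weights with a small number of line segments and keep only the segments' parameters) can be sketched roughly as follows. This is a minimal illustration in NumPy under assumed details: the greedy segment growing, the single error tolerance `tol`, and the three-values-per-segment storage estimate are illustrative stand-ins, not the thesis's actual thresholds (p, p_m, p_d), encoding format, or Huffman coding stage.

```python
import numpy as np

def piecewise_linearize(weights, tol=1e-2):
    """Greedily split a flattened weight sequence into line segments.

    Each segment is stored as (start_index, slope, intercept); a segment is
    extended as long as every weight in it stays within `tol` of the fitted
    line.  The segment list replaces the raw weights.
    """
    w = np.asarray(weights, dtype=np.float32).ravel()
    segments = []          # one (start, slope, intercept) tuple per segment
    start = 0
    while start < len(w):
        end = start + 1
        slope, intercept = 0.0, float(w[start])
        # Grow the segment while a least-squares line still fits within tol.
        while end < len(w):
            x = np.arange(start, end + 1)
            s, b = np.polyfit(x, w[start:end + 1], 1)
            if np.max(np.abs(s * x + b - w[start:end + 1])) > tol:
                break
            slope, intercept, end = float(s), float(b), end + 1
        segments.append((start, slope, intercept))
        start = end
    return segments, len(w)

def reconstruct(segments, n):
    """Rebuild an approximate weight sequence from the stored segments."""
    w = np.empty(n, dtype=np.float32)
    bounds = [s[0] for s in segments] + [n]
    for (start, slope, intercept), end in zip(segments, bounds[1:]):
        x = np.arange(start, end)
        w[start:end] = slope * x + intercept
    return w

# Toy example: a smooth synthetic weight sequence; real layer weights would be
# segmented according to the thesis's own thresholds and encoding.
layer = np.linspace(-1.0, 1.0, 4096) ** 3 + 0.01 * np.random.randn(4096)
segs, n = piecewise_linearize(layer, tol=0.05)
print("segments:", len(segs), "approx. compression rate:", n / (3 * len(segs)))
```

    Under this rough storage model, a segment covering k weights costs 3 stored values instead of k, so a 2x compression rate corresponds to segments of about 6 weights on average, which is consistent with the "around 50% of the original space" figure in the abstract.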

    Contents
    Abstract in Chinese (iii)
    Abstract in English (iv)
    Acknowledgements (v)
    Contents (vi)
    List of Figures (viii)
    List of Tables (ix)
    List of Algorithms (x)
    1 Introduction (1)
    2 Related Works (3)
    3 Proposed Method (5)
    3.1 Linearization (5)
    3.2 Encoding (7)
    3.3 Parallel lines & Huffman coding (9)
    4 Comprehensive Analysis of Compression Scheme (15)
    4.1 Compression Rate (15)
    4.2 Time Complexity of Piecewise Linearization (16)
    4.3 Approximation Error (17)
    5 Experiments (22)
    5.1 Impact of Parameters of the Piecewise Linearization (22)
    5.1.1 Impact of the Threshold p (22)
    5.1.2 Impact of the Threshold p_m (23)
    5.1.3 Impact of the Threshold p_d (25)
    5.2 Ablation Study (25)
    5.3 Performance on ImageNet Classification (26)
    5.3.1 VGG-16 on ImageNet (26)
    5.3.2 Resnet152 on ImageNet (27)
    5.3.3 Densenet169 on ImageNet (28)
    5.4 Comparison with modern matrix encoding methods (29)
    6 Conclusion (32)
    References (33)

    Full-text release date: 2026/08/12 (campus network)
    Full-text release date: 2026/08/12 (off-campus network)
    Full-text release date: 2026/08/12 (National Central Library: Taiwan Master's and Doctoral Theses System)