
Graduate Student: SETYA WIDYAWAN PRAKOSA
Thesis Title: Improving the Accuracy of Pruned Network Using Knowledge Distillation
Advisor: Jenq-Shiou Leu (呂政修)
Committee Members: Hsing-Lung Chen, Yie-Tarng Chen, Wen-Shien Fang, Jenq-Shiou Leu, Ray-Guang Cheng
Degree: Master
Department: Department of Electronic and Computer Engineering
Publication Year: 2018
Graduation Academic Year: 106
Language: English
Number of Pages: 42
Keywords: Convolutional Neural Networks (CNN), compression technique, pruning filters, Knowledge Distillation (KD), accuracy, inference time

The introduction of Convolutional Neural Networks (CNN) to the image processing field has attracted researchers to explore their applications. Several network designs have been proposed to reach state-of-the-art capability. However, current neural network designs still suffer from an issue related to model size. Thus, several techniques have been introduced to reduce or compress the model size.
A compression technique may reduce the accuracy of the compressed model relative to the original one and may also affect the overall performance of the new model, so a new scheme is needed to enhance the accuracy of the compressed network. In this study, we show that Knowledge Distillation (KD) can be integrated with one of the pruning methodologies, namely pruning filters, as the compression technique to enhance the accuracy of the pruned model.
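For context, knowledge distillation trains the smaller (pruned) student network to match the softened output distribution of the original teacher network in addition to the ground-truth labels. The sketch below shows a typical Hinton-style distillation loss in PyTorch; the temperature T and weighting alpha are illustrative placeholders, not the hyper-parameters used in this thesis.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """KD loss: soft-target KL term plus hard-label cross-entropy.

    T and alpha are illustrative values, not the thesis' actual settings.
    """
    # Soften both output distributions with the temperature T.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    soft_student = F.log_softmax(student_logits / T, dim=1)
    # KL divergence between softened distributions, scaled by T^2 as in Hinton et al.
    kd_term = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * (T * T)
    # Standard cross-entropy against the ground-truth labels.
    ce_term = F.cross_entropy(student_logits, labels)
    return alpha * kd_term + (1.0 - alpha) * ce_term
```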
From all experimental results, we conclude that incorporating KD into the training of a MobileNets model can enhance the accuracy of the pruned network without meaningfully elongating the inference time: the measured inference time of the model trained with KD is only 0.1 s longer than that of the model trained without KD. Furthermore, when the model size is reduced by 26.08%, the accuracy is 63.65% without KD and rises to 65.37% when KD is incorporated.
Pruning filters also reduces the model size itself: the original MobileNets model is 14.4 MB, and a 26.08% reduction decreases it to 11.3 MB. Compressing the model additionally saves 0.1 s of inference time.
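The pruning-filters step referred to above removes whole convolution filters rather than individual weights, which is why the saved model shrinks and inference becomes faster. Below is a minimal sketch of the L1-norm ranking criterion from "Pruning Filters for Efficient ConvNets" (Li et al.), assuming PyTorch; the 0.26 ratio is only an illustrative stand-in for the 26.08% reduction reported above, and the helper name is hypothetical.

```python
import torch
import torch.nn as nn

def smallest_filter_indices(conv: nn.Conv2d, prune_ratio: float):
    """Rank the filters of a Conv2d layer by their L1 norm and return the
    indices of the smallest ones, i.e. the filters to prune."""
    # weight shape: (out_channels, in_channels, kH, kW); L1 norm per output filter
    l1 = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    n_prune = int(prune_ratio * conv.out_channels)
    return torch.argsort(l1)[:n_prune]  # indices of filters to remove

# Example: mark roughly 26% of the lowest-L1 filters of one layer for removal.
layer = nn.Conv2d(32, 64, kernel_size=3)
to_remove = smallest_filter_indices(layer, 0.26)
```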



ABSTRACT
ACKNOWLEDGEMENTS
CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF EQUATIONS
CHAPTER 1 INTRODUCTION
  1.1 Research Background
  1.2 Objective
  1.3 Research Scope and Constraint
  1.4 Outline and Report
CHAPTER 2 LITERATURE REVIEW
  2.1 Existing Methodology on Model Compression
    2.1.1 Knowledge Distillation
    2.1.2 Pruning Filters
  2.2 Neural Network Architecture
    2.2.1 LeNet-5
    2.2.2 AlexNet
    2.2.3 MobileNets
    2.2.4 AlexNet
CHAPTER 3 METHODOLOGY
  3.1 Preliminary Study
  3.2 Proposed Scheme
    3.2.1 Architecture
    3.2.2 How can we prune?
  3.3 Environment settings
CHAPTER 4 RESULTS AND DISCUSSION
  4.1 Preliminary Study
  4.2 Proposed Scheme
    4.2.1 Accuracy of proposed scheme
    4.2.2 Inference time and model size
    4.2.3 Retraining to recover dropped accuracy
CHAPTER 5 CONCLUSION AND FUTURE WORKS
  5.1 Conclusion
  5.2 Future Works
REFERENCES

