Study of Malware Detection Based on Deep Learning Algorithms | National Taiwan University of Science and Technology Doctoral and Master's Theses System


Graduate student:	鍾昌霖 Paul Elijah Setiasabda
Thesis title:	Study of Malware Detection Based on Deep Learning Algorithms
Advisor:	呂政修 Jenq-Shiou Leu
Oral defense committee:	方文賢 Wen-Hsien Fang, 陳郁堂 Yie-Tarng Chen, 陳省隆 Hsing-Lung Chen
Degree:	Master
Department:	College of Electrical Engineering and Computer Science, Department of Electronic and Computer Engineering
Publication year:	2021
Graduation academic year:	109 (ROC calendar)
Language:	English
Number of pages:	57
Keywords:	Machine Learning, Deep Learning, Malware Detection, Malware Images, Convolutional Neural Network (CNN)

This research investigates how effectively several deep learning algorithms detect malware from malware images. The amount of malware found in network systems grows sharply every year, causing substantial financial and privacy damage. Malware has also grown more sophisticated: its creators obfuscate code so that signature-based and heuristic methods cannot detect it quickly, which motivates a new approach based on malware images. In this approach, malware data is first converted into images, which are then fed into machine learning or deep learning architectures. Several algorithms were trained and tested, including the Multilayer Perceptron (MLP), the Convolutional Neural Network (CNN), the CNN Long Short-Term Memory (CNN-LSTM), and the CNN Support Vector Machine (CNN-SVM). The CNN proved the most suitable of these for malware images, reaching an accuracy of up to 97.3% on one dataset with a training time of only 43 seconds.
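The conversion step described in the abstract can be illustrated with a minimal sketch. This record does not state the exact procedure used in the thesis, so the code below assumes the common byte-to-pixel scheme: each byte of a file becomes one 8-bit grayscale pixel, and the byte stream is wrapped into rows of a fixed width. The function name `bytes_to_grayscale_image`, the default width of 256, and the zero-padding of the last row are all illustrative assumptions, not details taken from the thesis.

```python
import numpy as np


def bytes_to_grayscale_image(data: bytes, width: int = 256) -> np.ndarray:
    """Map raw bytes to a 2-D uint8 array (hypothetical helper).

    Each byte is treated as one grayscale pixel (0-255). Rows wrap at
    `width`; the final row is zero-padded (an assumption made here so
    that the array is rectangular).
    """
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = np.frombuffer(data, dtype=np.uint8)
    height = -(-len(pixels) // width)  # ceiling division
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[: len(pixels)] = pixels
    return padded.reshape(height, width)


# Synthetic stand-in for a binary file: 770 bytes in total.
sample = bytes(range(256)) * 3 + b"\x90\x90"
image = bytes_to_grayscale_image(sample, width=64)
print(image.shape)  # (13, 64): 12 full rows plus one padded row
```

An array like this would typically be resized to a fixed shape and scaled to the range [0, 1] before being passed to a classifier such as a CNN; those steps are omitted here because the record does not describe them.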

CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF EQUATIONS

CHAPTER 1 INTRODUCTION
1    Research Background
2    Research Objectives
3    Research Scope and Limitations
4    Outline and Report

CHAPTER 2 RELATED WORKS
1    Malware Images
2    Machine Learning and Deep Learning
2.1    Convolutional Neural Network (CNN)
2.2    Multilayer Perceptron (MLP)
2.3    Long Short-Term Memory (LSTM)
2.4    Overfitting
2.5    Swish Activation Function
2.6    Optuna
2.7    TensorFlow 2
3    Evaluation Metrics
3.1    Training Time
3.2    Accuracy
3.3    F1-Score
4    Related Research

CHAPTER 3 PROPOSED METHOD
1    Data Collection
2    Data Pre-Processing
3    Model Training
3.1    CNN Architecture
3.3    CNN-LSTM
3.4    Daniel Gibert’s CNN
4    Validation
5    Optimization

CHAPTER 4 EXPERIMENT AND RESULT
1    Dataset
2    Experiment Result
2.1    Malimg Dataset
2.2    Malevis Dataset

CHAPTER 5 EVALUATION AND DISCUSSION
1    Evaluation Result
1.1    Accuracy
1.2    F1-Score
1.3    Training Time
2    Discussion

CHAPTER 6 CONCLUSION AND FUTURE RESEARCH
1    Conclusion
2    Future Research
REFERENCES
APPENDIX A : Precision
APPENDIX B : Recall

                                

