
Author: 吳濬志 (Chun-chih Wu)
Thesis Title: 基於動態行為的惡意程式偵測 (Malware Detection Based on Dynamic Behavior Analysis)
Advisor: 鮑興國 (Hsing-Kuo Pao)
Committee Members: 鄧惟中 (Wei-Chung Teng), 項天瑞 (Tien-Ruey Hsiang), 李育杰 (Yuh-Jye Lee)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of Publication: 2012
Academic Year of Graduation: 100
Language: English
Number of Pages: 44
Chinese Keywords: 不相似距離, 動態分析, 惡意程式偵測, 馬可夫鏈, 沙箱, 奇異值分解
English Keywords: dissimilarity function, dynamic analysis, malware detection, Markov chain, sandbox, singular value decomposition
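  • Malware detection is expected to be one of the most important research topics in the coming years. In recent years, malware has evolved into different forms and intrudes into computers in different ways. Recently developed technologies such as virtualization and cloud computing offer a new approach to detecting malware. The basic methods of malware detection are static analysis and dynamic analysis, which differ in scope and effectiveness: static analysis can be done offline but may fail to handle code packing, obfuscation, etc.; dynamic analysis, on the other hand, must be done in real time, and we expect it to reveal more behaviors of the studied code or malware. We focus on dynamic analysis for malware detection and build our own sandbox to provide a secure virtualized environment. In our system framework, we use a Markov chain to analyze the dynamic behavior of malware; more specifically, we use a Markov chain to model the temporal relationship of the system's registries and paths. We use a dissimilarity measure to describe the distance between malware samples. For comparison with other kinds of detection models, we study SVM-based detection as well as detection based on singular value decomposition. According to our experiments, the detection rate can exceed 80%, and the error rate can be as low as 2%.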


    Malware detection has been one of the most important research topics since computers came into use, and it is expected to remain so for years to come. In recent years, malware has evolved into different forms with different intrusion intentions, and newly developed technologies such as virtualization and cloud computing offer a new perspective on malware detection. Malware detection is broadly categorized into static analysis and dynamic analysis. The two differ in scope and effectiveness: static analysis can be done offline but may not cope with code packing, obfuscation, etc.; dynamic analysis, on the other hand, must be done in real time, and we expect it to reveal more behaviors of the studied code or malware. In this work, we focus on using dynamic analysis to detect malware, based on a self-built sandbox that provides virtualization in a secure environment. In our framework, we model a code's dynamic behavior with a Markov chain; more specifically, we use a Markov chain to model the temporal relationship of the registries and paths of the system calls. After capturing the temporal relationship, we use a novel dissimilarity function to describe the pairwise distance between each pair of malware or benign codes, and detection is done by a simple nearest neighbor search. For comparison with other kinds of detection models, we study SVM-based detection as well as detection based on singular value decomposition. According to our evaluation, the best detection performance reaches 80% accuracy when classifying codes into different kinds of malware or the benign class, and an error rate as low as 2% when classifying codes into benign and malicious groups.
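
    The pipeline outlined in the abstract (Markov chain over observed registries and paths, a pairwise dissimilarity, then nearest neighbor search) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the example traces, and the summed-absolute-difference dissimilarity are all assumptions made for illustration, since the abstract does not define the thesis's own dissimilarity function.

```python
from collections import Counter, defaultdict


def transition_probs(events):
    """Estimate first-order Markov transition probabilities from one trace.

    `events` is an ordered list of observed items (here: registry keys or
    paths touched by system calls; the exact encoding is an assumption).
    """
    counts = defaultdict(Counter)
    for src, dst in zip(events, events[1:]):
        counts[src][dst] += 1
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        for dst, n in row.items():
            probs[(src, dst)] = n / total
    return probs


def dissimilarity(p, q):
    """Placeholder distance (assumption): summed absolute difference
    between two sets of transition probabilities."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def nearest_label(query, labeled_traces):
    """1-nearest-neighbor search: label of the closest labeled trace."""
    qp = transition_probs(query)
    closest = min(
        labeled_traces,
        key=lambda item: dissimilarity(qp, transition_probs(item[0])),
    )
    return closest[1]


# Hypothetical traces; real traces would come from the sandbox logs.
training = [
    (["reg:Run", "path:System32", "reg:Run", "path:System32"], "malicious"),
    (["reg:Software", "path:Users", "reg:Software", "path:Users"], "benign"),
]
query = ["reg:Run", "path:System32", "reg:Run"]
print(nearest_label(query, training))
```

    Representing each trace as a sparse dictionary of transition probabilities keeps the sketch independent of a fixed state vocabulary; a dense transition matrix would be the natural choice if SVD-based noise removal were applied afterwards.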

    1 Introduction
        1.1 Motivation
        1.2 Proposed Method
        1.3 Outline of the Thesis
    2 Related Work
        2.1 Static Analysis
        2.2 Dynamic Analysis
        2.3 Sandbox
        2.4 Malware Detection by Classification and Clustering
    3 Framework
        3.1 Preprocess
            3.1.1 Sandbox
            3.1.2 System Call
        3.2 Markov Chain
        3.3 Support Vector Machine
        3.4 Dissimilarity Function
        3.5 Other Techniques
            3.5.1 Isomap
            3.5.2 Singular Value Decomposition
    4 Data Set Description
    5 Experiment
        5.1 Preprocessing
        5.2 Markov Chain
            5.2.1 Registry with System Path as Feature
            5.2.2 Registry as Feature only
        5.3 Multi-labeled SSVM
            5.3.1 Registry with System Path as Feature
            5.3.2 Registry as Feature only
        5.4 Dimension Reduction Using Isomap
        5.5 Noise Removal Using SVD
    6 Conclusion and Future Work
        6.1 Conclusion
        6.2 Future Work
            6.2.1 Advanced Persistent Threat Attack
            6.2.2 Threat in Embedded System

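    Full-text release date: 2017/07/23 (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: Taiwan theses and dissertations system)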