
Graduate student: 歐陽良柏 (Liang-Bo Ouyang)
Thesis title: Robustness Evaluation of Graph-based Malware Detection Using Code-level Adversarial Attack with Explainability (使用原始碼層級具有可解釋性的對抗式攻擊評估基於圖惡意程式檢測的強健性)
Advisors: 李漢銘 (Hahn-Ming Lee), 鄭欣明 (Shin-Ming Cheng)
Oral defense committee: 黃俊穎 (Chun-Ying Huang), 蕭旭君 (Hsu-Chun Hsiao), 毛敬豪 (Ching-Hao Mao)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar; 2020-2021)
Language: Chinese
Number of pages: 60
Keywords: Control flow graph, Adversarial example, Adversarial attack, Malware detection, Static analysis, Explainability
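  • To identify IoT malware, which is now widespread and highly damaging, many malware detection methods have been proposed. However, many existing methods do not base their decisions on the sections where malicious code actually resides; they only learn the differences between malicious and benign programs, which limits their value as detectors. As a result, a detector can easily be misled by adversarial attacks even when the malicious content of the program is left unmodified. This thesis first trains a robust detector on graph-based and opcode-based features, which successfully captures program semantics. To analyze the accuracy and robustness of the proposed detector, we design a source-code-level attack that generates attack files while preserving the original malicious behavior. Notably, we use explainability analysis to select the most influential features for the attack, and we evaluate the robustness of the graph-based detector and the graph-and-opcode-based detector by comparing post-attack accuracy and the degree of attack impact. Experimental results show that malware detectors should consider features from different aspects to achieve higher accuracy and robustness.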


    To identify IoT malware, which is both widespread and severely damaging, many feature-based malware detectors have been proposed in recent years. However, existing solutions mainly learn the differences between malware and benign software to judge whether a sample is malicious, rather than analyzing actual malicious behavior. As a result, these detectors are easily misled by adversarial attacks, even when the malicious content is not modified. In this thesis, we first leverage both graph-based and opcode-based features to train a robust detector that captures the semantics of a binary. To analyze the accuracy and robustness of the proposed detectors, we design a code-level attack that generates mutated, executable binaries while preserving their malicious behavior. In particular, we use explainability analysis to select the most influential features for the code-level attack. With this attack, we evaluate the robustness of the graph-based detector and of the detector that combines graph-based and opcode-based features by comparing their post-attack accuracy and the degree of impact of the attack. The experimental results show that a malware detector should consider different types of features to achieve both high accuracy and robustness.
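
The idea of a behavior-preserving, code-level mutation can be illustrated with a toy example. The following Python sketch is not the thesis's implementation: the feature names (`num_nodes`, `num_edges`, `density`, `op_*`), the helper `add_dead_function`, and the tiny hand-written control flow graph are all assumptions made for illustration. It shows that appending a function that is never called leaves the original blocks and edges untouched while still shifting graph-based and opcode-based feature values.

```python
from collections import Counter


def cfg_features(edges, opcodes):
    """Compute a few illustrative features for a control flow graph.

    edges   -- list of (src, dst) basic-block pairs
    opcodes -- dict mapping basic-block name -> list of opcode mnemonics
    """
    nodes = set(opcodes)
    for src, dst in edges:
        nodes.update((src, dst))
    n, m = len(nodes), len(edges)
    features = {
        "num_nodes": n,
        "num_edges": m,
        # Directed-graph density: edges divided by possible ordered pairs.
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
    }
    # Relative opcode frequencies over the whole program.
    counts = Counter(op for block in opcodes.values() for op in block)
    total = sum(counts.values())
    for op, count in counts.items():
        features[f"op_{op}"] = count / total
    return features


def add_dead_function(edges, opcodes, name="junk"):
    """Hypothetical mutation: append a two-block function that no
    existing block branches to, so the original control flow is unchanged."""
    new_edges = list(edges) + [(f"{name}_entry", f"{name}_exit")]
    new_opcodes = dict(opcodes)
    new_opcodes[f"{name}_entry"] = ["push", "mov"]
    new_opcodes[f"{name}_exit"] = ["pop", "ret"]
    return new_edges, new_opcodes


# Toy program: a diamond-shaped CFG (a -> b -> d, a -> c -> d).
original_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
original_opcodes = {
    "a": ["cmp", "jne"],
    "b": ["mov", "jmp"],
    "c": ["call"],
    "d": ["ret"],
}

before = cfg_features(original_edges, original_opcodes)
mutated_edges, mutated_opcodes = add_dead_function(original_edges, original_opcodes)
after = cfg_features(mutated_edges, mutated_opcodes)

# List every feature whose value differs after the mutation.
changed = sorted(k for k in set(before) | set(after) if before.get(k) != after.get(k))
print(changed)
```

A detector that relies only on features of this kind sees a different input after the mutation, although no original block or edge was altered. In this toy setting, that is the sort of gap a robustness evaluation would measure.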

    Chinese Abstract
    Abstract
    Acknowledgements
    1 Introduction
        1.1 Motivation
        1.2 Challenges and Goals
        1.3 Contributions
        1.4 Outline of the Thesis
    2 Background and Related Work
        2.1 Existing Research
            2.1.1 Static Malware Detection
            2.1.2 Using Control Flow Graphs for Malware Detection
            2.1.3 Adversarial Attacks on Malware Detection Methods
            2.1.4 Explainability Analysis for Malware Detection
        2.2 Detection Model
    3 Robustness Evaluation of Graph-based Malware Detection with Explainability by Attacking
        3.1 Attack Framework
            3.1.1 Random Forest (RF) Detector (a)
            3.1.2 SHAP Value & RF Importance (b)
            3.1.3 Attack Function (c)
        3.2 Detection Features
        3.3 SHAP Value & RF Importance Analysis
        3.4 Attack Function
            3.4.1 Meaningless Function
            3.4.2 Full Calling Function
            3.4.3 Condition & Initial Function
            3.4.4 Library Function
        3.5 Adjust After Attack
    4 Experimental Results and Robustness Analysis
        4.1 Malware Detector & Environment Setup
        4.2 Code-level Adversarial Attack
            4.2.1 Meaningless Function
            4.2.2 Full Calling Function
            4.2.3 Condition & Initial Function
            4.2.4 Library Function
            4.2.5 Robustness Analysis
        4.3 Attack Existing Detector
    5 Limitations and Future Work
        5.1 Limitations
        5.2 Future Work
    6 Conclusions

