
Graduate student: Hung-Liang Hsu (許閎量)
Thesis title: Characterizing Malware Detectors via Black-Box Adversarial Attacks (透過黑箱對抗式攻擊揭露惡意軟體檢測器的特性)
Advisors: Hahn-Ming Lee (李漢銘), Shin-Ming Cheng (鄭欣明)
Oral defense committee: Yuh-Jye Lee (李育杰), Yi-Ting Huang (黃意婷)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: English
Number of pages: 65
Keywords (Chinese): 惡意軟體檢測器, 對抗式攻擊, 惡意軟體, 黑箱, 資訊揭露
Keywords (English): Adversarial attack, Malware detector, Malware, Black-box, Data reveal
    Abstract

    Most existing adversarial attacks rely on a white-box setting. In practice, however, defenders avoid revealing any useful information to attackers, which shifts the battlefield to a black-box setting and makes adversarial attacks considerably more challenging. To overcome this obstacle, most existing attack methods generate adversarial examples by attacking a surrogate model. The effectiveness of such methods depends entirely on how closely the behavior of the surrogate model matches that of the target model; when the two models are trained on different types of features, the attack may be completely ineffective. In this thesis, we therefore propose a new framework that systematically and iteratively adds perturbations to malware and analyzes the impact of these perturbations on the output of the detection model. This process ultimately infers the feature combinations that are influential in detection and the features that the target model was likely trained on. This information supports the training of surrogate models and helps break free from the constraints of the black-box setting. Our work also emphasizes that malware detectors must account for potential leakage of model information in order to improve their robustness against adversarial attacks, offering a new perspective for this field.
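The perturb-and-observe idea in the abstract can be illustrated with a small Python sketch. It is only a toy under stated assumptions and is not the Assembly-layer or Adaptive-GEA attack from the thesis: the detector, its hidden weights, the feature-family names (`opcode_ngrams`, `call_graph`, `strings`), and the helper `probe_feature_families` are all hypothetical. A sample is modeled simply as the amount of extra content added per feature family, and the prober sees nothing but the score returned by each query.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Assumption: a sample is "extra content added per feature family".
Sample = Dict[str, float]
Detector = Callable[[Sample], float]  # returns a maliciousness score in [0, 1]


def make_toy_detector() -> Detector:
    """Hypothetical stand-in for the target model; the prober never reads these weights."""
    hidden_weights = {"opcode_ngrams": -0.08, "call_graph": -0.03, "strings": 0.0}

    def score(sample: Sample) -> float:
        raw = 0.9 + sum(w * sample.get(name, 0.0) for name, w in hidden_weights.items())
        return min(1.0, max(0.0, raw))

    return score


def probe_feature_families(
    detector: Detector,
    sample: Sample,
    families: Sequence[str],
    steps: int = 5,
) -> List[Tuple[str, float]]:
    """Rank feature families by how far additive perturbations move the score."""
    baseline = detector(sample)
    influence: Dict[str, float] = {}
    for family in families:
        perturbed = dict(sample)  # perturb one family at a time, starting from the original
        last_score = baseline
        for _ in range(steps):
            # Additive-only change, mimicking perturbations that leave the
            # original malicious functionality intact.
            perturbed[family] = perturbed.get(family, 0.0) + 1.0
            last_score = detector(perturbed)  # one black-box query per step
        influence[family] = baseline - last_score
    return sorted(influence.items(), key=lambda item: item[1], reverse=True)


detector = make_toy_detector()
original = {"opcode_ngrams": 0.0, "call_graph": 0.0, "strings": 0.0}
ranking = probe_feature_families(detector, original, list(original))
for family, drop in ranking:
    print(f"{family}: score drop {drop:.2f}")
```

In this toy setting, families whose perturbation barely moves the score are candidates for features the target does not use, while families with a large drop are candidates for the features a surrogate model could be trained on.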

    Table of Contents

    Abstract (Chinese)
    Abstract
    1 Introduction
    1.1 Motivation
    1.2 Challenges and Goals
    1.3 Contributions
    1.4 Outline of the Thesis
    2 Background and Related Work
    2.1 Static Malware Detection
    2.1.1 Binary-based
    2.1.2 Signature-based
    2.1.3 Structure-based
    2.2 Adversarial Attacks on Malware Detector
    2.2.1 Adversarial Attacks Scenarios
    2.2.2 Functionality Preserving Problem
    2.2.3 Functionality Preserving Attacks
    2.3 Explainability Analysis on Machine Learning Models
    2.3.1 SHAP
    2.3.2 Adversarial Attack Based on Model Explainability
    3 Characterizing Malware Detectors via Assembly-layer Attack and Adaptive-GEA Attack
    3.1 Threat Model
    3.2 Methodology
    3.2.1 Assembly-layer Attack Using Explainability Analysis
    3.2.2 Adaptive-GEA (Adaptive Graph Embedding and Augmentation) Attack
    3.2.3 Application Scenarios and Advantages
    4 Experimental Results
    4.1 Dataset
    4.2 Target Model and Experimental Settings
    4.3 Analysis on Assembly-layer Attack Results
    4.4 Analysis on Adaptive-GEA Attack Results
    4.4.1 Average Iteration
    4.4.2 Selection Count of Benign-ware samples
    5 Limitations and Future Work
    5.1 Limitations
    5.1.1 Limitations of Assembly-layer Attack
    5.1.2 Limitations of Adaptive-GEA Attack
    5.2 Future Work
    6 Conclusions

