
Author: Tzu-Yang Chen (陳子揚)
Thesis title: Structural Attack against Graph-based IoT Malware Detection at Assembly Level
Advisors: Hahn-Ming Lee (李漢銘), Shin-Ming Cheng (鄭欣明)
Oral defense committee: Yuh-Jye Lee (李育杰), Chun-Ying Huang (黃俊穎), Chia-Mu Yu (游家牧)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2022
Academic year of graduation: 110 (ROC calendar)
Language: English
Pages: 51
Keywords: adversarial attack, adversarial example, control flow graph, malware detection, static analysis

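Malware has long been one of the most important threats to IoT security. Recent studies show that machine-learning-based static malware detectors are highly effective at detecting previously unseen malware. Among them, graph-based detectors that use the control flow graph (CFG) can accurately represent a malware sample's semantics and flow structure, and therefore perform particularly well on the detection task. However, machine learning is inherently vulnerable to adversarial attacks, which carefully perturb input samples to produce adversarial examples that confuse the model. In recent years, many studies on adversarial attacks have aimed to help malware evade machine learning detectors. They perturb or append a small number of bytes so that the detector misclassifies the sample as benign. To preserve the functionality of the original sample, they usually modify unimportant locations in the program that are never executed. When features that capture the program's execution flow structure and semantics are considered, however, such never-executed modifications cannot effectively influence those features, and so they can hardly attack detectors that classify on them. We therefore propose a powerful structural attack that injects carefully crafted instruction sequences into the program. Unlike existing attacks, our injected content is actually executed, so it affects the CFG features that represent the program structure while still preserving the functionality of the original binary. Experimental results show that our method performs very well at evading detectors that use graph-based and opcode-based features.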


Malware has been one of the most critical threats to IoT security. Recent studies have shown that machine learning (ML) malware detectors based on static analysis are powerful in identifying unknown malware. Among these detectors, structure-based detectors that use the control flow graph (CFG) can accurately represent malware's semantics and process structure, and are therefore considered an effective solution. However, ML is inherently vulnerable to adversarial attacks, in which the adversary carefully perturbs input samples to generate adversarial samples that mislead the model. Much research on adversarial attacks in recent years has focused on evading ML-based malware detectors. These attacks make detectors misclassify malicious samples by perturbing or adding a small number of bytes at positions that are never executed at runtime, which preserves the original functionality. However, modifications that do not alter the program structure are easily caught by structure-based detectors. This thesis proposes an adversarial attack that modifies the program structure by inserting crafted opcodes into the targeted binary, so that general graph-based malware detectors are misled. In particular, we disassemble the targeted binary, investigate the resulting CFG features, and generate four types of attack opcodes while preserving the original binary's functionality. The experimental results show that our method is highly effective at evading detectors that use graph-based and opcode-based features.
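The contrast the abstract draws between never-executed padding and executed injected code can be illustrated with a toy CFG. The sketch below is only an illustration under stated assumptions, not the thesis's payloads or feature set: the block names, the helper functions (`cfg_features`, `inject_on_edge`, `append_dead_code`), and the three features (node count, edge count, density) are all hypothetical, and the CFG is assumed to be recovered by following control flow from the entry block.

```python
from collections import deque


def reachable(cfg, entry):
    """Return the set of basic blocks reachable from `entry` (breadth-first search)."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        node = queue.popleft()
        for succ in cfg.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


def cfg_features(cfg, entry):
    """Hypothetical feature set: node count, edge count, and directed-graph
    density, computed over the part of the CFG reachable from `entry`."""
    nodes = reachable(cfg, entry)
    edges = sum(1 for n in nodes for s in cfg.get(n, ()) if s in nodes)
    n = len(nodes)
    density = edges / (n * (n - 1)) if n > 1 else 0.0
    return {"nodes": n, "edges": edges, "density": density}


def inject_on_edge(cfg, src, dst, new_blocks):
    """Copy `cfg` and reroute the edge src -> dst through a chain of new
    blocks that ends by continuing to `dst`, so the chain is executed."""
    out = {n: list(s) for n, s in cfg.items()}
    chain = list(new_blocks)
    out[src] = [chain[0] if s == dst else s for s in out[src]]
    for a, b in zip(chain, chain[1:] + [dst]):
        out.setdefault(a, []).append(b)
    return out


def append_dead_code(cfg, new_blocks):
    """Copy `cfg` and add a chain of blocks that no existing block jumps to."""
    out = {n: list(s) for n, s in cfg.items()}
    chain = list(new_blocks)
    for a, b in zip(chain, chain[1:]):
        out.setdefault(a, []).append(b)
    out.setdefault(chain[-1], [])
    return out


# Toy CFG (assumed for illustration): a single if/else diamond.
original = {
    "entry": ["check"],
    "check": ["then", "else"],
    "then": ["exit"],
    "else": ["exit"],
    "exit": [],
}

base = cfg_features(original, "entry")
dead = cfg_features(append_dead_code(original, ["pad1", "pad2"]), "entry")
live = cfg_features(
    inject_on_edge(original, "entry", "check", ["inj1", "inj2"]), "entry"
)

print("original :", base)
print("dead code:", dead)
print("injected :", live)
```

Under these assumptions the unreachable padding leaves all three features unchanged, while the injected chain on the `entry -> check` edge adds two nodes and two edges and lowers the density. A real extractor may treat unreachable code differently, so the result depends on how the detector's CFG is built.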

Abstract (Chinese)
Abstract (English)
Acknowledgements
1 Introduction
  1.1 Motivation
  1.2 Challenges and Goals
  1.3 Contributions
  1.4 Outline of the Thesis
2 Background and Related Work
  2.1 Existing Research
    2.1.1 Static Malware Detection
    2.1.2 CFG Malware Detection
    2.1.3 Challenge of Adversarial Attack on Malware Detection
    2.1.4 Functionality-preserving Adversarial Attack
3 Structural Attack against Graph-based IoT Malware Detection at Assembly Level
  3.1 Threat Model
  3.2 Code Injection
  3.3 Payload Generation
  3.4 Adversarial Attack
4 Experimental Results and Robustness Analysis
  4.1 Malware Detector
  4.2 Assembly-level Structural Attack
5 Limitations and Future Work
  5.1 Limitations
  5.2 Future Work
6 Conclusions

