
Graduate student: 王勃淵
Po-Yuan Wang
Thesis title: 利用執行順序進行惡意軟體檢測以提升穩健性
Robustness Enhancement of Malware Detection Using Execution Order
Advisor: 鄭欣明
Shin-Ming Cheng
Oral defense committee: 李漢銘
Hahn-Ming Lee
李育杰
Yuh-Jye Lee
黃意婷
Yi-Ting Huang
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Publication year: 2023
Graduation academic year: 111 (ROC calendar)
Language: English
Number of pages: 48
Chinese keywords: 惡意軟體, 穩健性, 人工智慧, 機器學習, 控制流程圖
English keywords: malware, robustness, artificial intelligence, machine learning, CFG
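  • With the rapid development of the Internet of Things (IoT), malware targeting IoT devices has been produced in large quantities. Although machine learning models already allow the presence of malware to be detected automatically, a serious concern remains: adversarial attacks against those models. An adversarial attack can use the model's feedback to modify malware and thereby produce adversarial samples that bypass the model, which makes model robustness an important issue. In this thesis, we take execution order into account to capture the malicious intent in malware and thereby improve the robustness of malware detection. To measure the model's robustness against malicious samples, we implement two adversarial attack methods and generate real adversarial samples to verify robustness. Our results show that considering execution order yields relatively accurate results on our dataset while the model maintains relatively high robustness. We further verify the robustness of our method with respect to the attack level, which is determined by the number of times the attack perturbation is inserted. We find that our method keeps the evasion rate consistent and relatively low across different attack levels.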


    With the rapid development of the Internet of Things (IoT), a large amount of malware targeting IoT devices has emerged. Although machine learning models can automatically detect the presence of malware, adversarial attacks against these models remain a significant concern. Such attacks can leverage feedback from the model to modify malware and generate adversarial samples that evade the model. As a result, the robustness of the model has become one of the most important issues.
    In our work, we utilize the execution order to better preserve the semantic information of the malicious intent hidden in malware programs, in order to enhance the robustness of malware detection. To evaluate the robustness against adversarial samples, we implement two adversarial attack methods and generate authentic adversarial samples for verification. The results demonstrate that considering the execution order enables us to achieve relatively accurate outcomes on our dataset while maintaining a high standard of robustness. Furthermore, we assess the performance of our method across different attack levels, where the attack level is determined by the number of payload injections. We observe that our method resists escalation of the attack level, resulting in consistently low evasion rates.
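The execution-order idea can be illustrated with a small sketch. The code below is not the thesis's implementation. It assumes a control flow graph (CFG) given as a plain dictionary from basic-block label to successor labels, and it uses a depth-first preorder walk from the entry block as one simple stand-in for "execution order". The block labels, the opcode lists, and the function names `execution_order` and `ordered_opcodes` are all hypothetical.

```python
def execution_order(cfg, entry):
    """Return basic-block labels in depth-first preorder from the entry block.

    cfg maps each block label to a list of successor labels (assumed format).
    """
    order, seen, stack = [], set(), [entry]
    while stack:
        block = stack.pop()
        if block in seen:
            continue
        seen.add(block)
        order.append(block)
        # Push successors reversed so the first-listed successor is visited first.
        stack.extend(reversed(cfg.get(block, [])))
    return order


def ordered_opcodes(cfg, opcodes, entry):
    """Concatenate per-block opcode lists following the traversal order."""
    sequence = []
    for block in execution_order(cfg, entry):
        sequence.extend(opcodes.get(block, []))
    return sequence


# Toy CFG and opcodes (illustrative values only, not taken from the thesis).
cfg = {
    "entry": ["check", "exit"],
    "check": ["connect"],
    "connect": ["send", "exit"],
    "send": ["exit"],
    "exit": [],
}
opcodes = {
    "entry": ["push", "mov"],
    "check": ["cmp", "jne"],
    "connect": ["call"],
    "send": ["call", "jmp"],
    "exit": ["ret"],
}

print(execution_order(cfg, "entry"))
print(ordered_opcodes(cfg, opcodes, "entry"))
```

A sequence built this way keeps the relative order of blocks along a path from the entry point, which an unordered bag of blocks or opcodes would discard. Which traversal and which embedding the thesis actually uses is described in its Methodology chapter, not here.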

    1 Introduction
      1.1 Challenges
      1.2 Motivation
      1.3 Contributions
    2 Background and Related Work
      2.1 Static Analysis
        2.1.1 Binary-based
        2.1.2 Signature-based
        2.1.3 Structure-based
      2.2 Adversarial Attacks on Malware
        2.2.1 Adversarial Attacks Scenarios
        2.2.2 Functionality Preserving Problem
        2.2.3 Functionality Preserving Attacks
    3 Methodology
      3.1 Framework
      3.2 Reverse Engineering
      3.3 Embeddings
        3.3.1 Node Embedding
        3.3.2 Graph Embedding
      3.4 Adversarial Sample Generation
    4 Experimental Results
      4.1 Dataset
      4.2 Model Training
      4.3 Evaluations
      4.4 Robustness against Adversarial Samples
    5 Conclusion

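    Full-text release date: 2025/08/15 (campus network)
    Full-text release date: 2025/08/15 (off-campus network)
    Full-text release date: 2025/08/15 (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)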