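Backdoor Attacks against Malware Detectors Using Static Feature Interdependency | National Taiwan University of Science and Technology Theses and Dissertations System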

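Graduate student:	Te Lo (羅得)
Thesis title:	Backdoor Attacks against Malware Detectors Using Static Feature Interdependency (利用靜態特徵相依性對惡意軟體檢測器後門攻擊)
Advisors:	Hahn-Ming Lee (李漢銘), Shin-Ming Cheng (鄭欣明)
Oral defense committee:	Shao-Jui Wang (王紹睿), Yuh-Jye Lee (李育杰)
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication:	2023
Academic year of graduation:	111 (ROC calendar)
Language:	English
Number of pages:	47
Keywords (Chinese, translated):	backdoor attack, opcode, malware detector, static feature, feature interdependency, malware, machine learning
Keywords (English):	backdoor attack, Opcode, FCG, CFG, malware, malware detector, machine learning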

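In recent years, machine learning models have made important progress in information security, and a growing number of IoT devices rely on them for malware detection. Because malware keeps evolving, machine-learning-based detection methods must collect training data regularly to remain accurate, and this introduces potential risks such as backdoor attacks. A backdoor attack implants a backdoor in a model by injecting a trigger into a portion of the samples in the training dataset. Traditional backdoor attack methods usually generate triggers for one specific feature extraction method only; in different detection scenarios, however, users employ different feature extraction methods, which causes such an attack to fail. In addition, triggers depend on and affect one another across feature types, so backdoor attacks aimed at different feature extraction methods cannot be realized on the same sample. In other words, directly combining triggers designed for different feature extraction methods does not achieve a multi-feature backdoor attack. To solve this problem, this study proposes a new backdoor attack method that uses explainability analysis to generate triggers by jointly considering how a trigger behaves across the various features, so that the backdoor attack takes effect against the currently mainstream feature extraction methods. The method copes more effectively with the diversity of feature extraction methods, thereby improving the generality and flexibility of backdoor attacks. Experimental results show that, against detectors using different features, our method performs better overall than traditional backdoor attacks.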

Machine learning models have made significant advances in information security, and a growing number of IoT devices depend on them for malware detection. However, because malware evolves continuously, machine-learning-based detectors must collect new training data regularly to stay accurate, which exposes them to risks such as backdoor attacks. A backdoor attack injects a trigger into a subset of the training samples so that a backdoor is embedded in the trained model. Traditional backdoor attacks typically generate triggers for one specific feature extraction method, so they become ineffective when a detector in a different scenario uses a different feature extraction technique. Moreover, triggers for different feature types depend on and influence one another, so simply combining triggers designed for different feature extraction methods does not yield a multi-feature backdoor attack within a single sample. To overcome this, our research introduces a new backdoor attack strategy. Using explainability analysis, we generate triggers according to their behavior across the various features, so that the backdoor attack remains effective under the current mainstream feature extraction methods. Our approach addresses the diversity of feature extraction techniques and improves the versatility and flexibility of backdoor attacks. Experimental results show that, against detectors using different features, our method performs better overall than traditional backdoor attacks.

Chinese Abstract
Abstract
Introduction
1 Motivation
2 Challenges and Goals
3 Contributions
4 Outline of the Thesis
Background and Related Work
1 ELF File Format
2 Static Malware Detection
3 Functionality-preserving
4 Limitations of Backdoor Attacks in Malware Detection
4.1 Backdoor Attack
4.2 Clean-label Backdoor Attacks
4.3 Limitations of Backdoor Attacks
5 Explainability Analysis and Applications of Machine Learning Models
5.1 SHAP
Backdoor Attack on Multi-feature Malware Detectors Using Explainability Analysis
1 System Model
2 Threat Model
3 Code Injection
4 Backdoor Attack Framework
4.1 Interrelation among Features
4.2 Feature Importance Analysis
4.3 Trigger Generation
4.4 Backdoor Attack
Experimental Results and Robustness Analysis
1 Dataset
2 Target Model and Experiment Setting
3 Analysis of Backdoor Attack
Limitations and Future Work
1 Limitations
2 Future Work
Conclusions

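Full-text release date: 2025/08/15 (campus network)
Full-text release date: 2025/08/15 (off-campus network)
Full-text release date: 2025/08/15 (National Central Library: Taiwan theses and dissertations system)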