
Author: 吳宗翰 (Tsung-Han Wu)
Thesis title: A Quantile-based Quick Data Poisoning Attack (一個基於分位數的快速資料投毒攻擊)
Advisor: 鄧惟中 (Wei-Chung Teng)
Oral defense committee: 鍾國亮 (Kuo-Liang Chung), 鮑興國 (Hsing-Kuo Pao), 林宗男 (Tsung-Nan Lin)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2023
Academic year of graduation: 111 (ROC academic year)
Language: Chinese
Number of pages: 42
Keywords: Data poisoning attack, Linear regression model, Machine learning model
    In recent years, the use of big data has become widespread, and a growing number of companies and users apply machine learning to exploit big data or to extract features from it. Malicious attackers may profit from this process by injecting carefully crafted attack data. Among such threats, data poisoning attacks are especially damaging to machine learning models: by inserting poisoned data points, an attacker degrades the availability of the model. In this study, we aim to propose an effective poisoning attack against regression models that requires little attacker knowledge. The goal is to expose attack methods that adversaries might use and to help future researchers develop defenses, in support of the vision of proactive cybersecurity defense.
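    As a rough illustration of what an availability attack on a regression model means, the sketch below fits a one-feature least-squares line on clean data and again after a few poisoned points are inserted, then compares the error on the clean data. It is a generic example only and does not reproduce the quantile-based method of the thesis; the data, the poisoned points, and all function names are assumptions made for illustration.

```python
# Generic illustration of poisoning a linear regression model.
# NOT the thesis's quantile-based attack; all values and names are hypothetical.

def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a*x + b with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def mse(model, xs, ys):
    """Mean squared error of the line (a, b) on the given points."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Clean training data lying on y = 2x + 1, with the feature scaled to [0, 1].
clean_x = [i / 10 for i in range(11)]
clean_y = [2 * x + 1 for x in clean_x]

# Assumed poisoned points: a valid feature value paired with a label
# far from the clean trend.
poison_x = [1.0, 1.0, 1.0]
poison_y = [0.0, 0.0, 0.0]

clean_model = fit_line(clean_x, clean_y)
poisoned_model = fit_line(clean_x + poison_x, clean_y + poison_y)

# Both models are evaluated on the clean data only.
clean_error = mse(clean_model, clean_x, clean_y)
poisoned_error = mse(poisoned_model, clean_x, clean_y)
print("MSE without poisoning:", clean_error)
print("MSE with poisoning:   ", poisoned_error)
```

    In this toy setting, three inserted points out of fourteen are enough to pull the fitted line away from the clean trend, which is the kind of availability damage the abstract refers to.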

    Abstract in Chinese
    Abstract in English
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    1 Introduction
        1.1 Research Background
        1.2 Research Motivation and Objectives
        1.3 Research Contributions
        1.4 Thesis Organization
    2 Literature Review
        2.1 Adversarial Model
            2.1.1 Attacker's Goal
            2.1.2 Attacker's Knowledge
            2.1.3 Attacker's Capability
            2.1.4 Strategies of Data Poisoning Attacks in the Adversarial Model
        2.2 Data Poisoning Attacks
            2.2.1 Gradient-based Poisoning Attacks
            2.2.2 Statistics-based Poisoning Attacks
        2.3 Transferability of Poisoning Attacks
    3 Methodology
        3.1 Attack Framework and Scenario Settings
            3.1.1 Attack Framework
            3.1.2 White-box Scenario
            3.1.3 Gray-box Scenario
            3.1.4 Black-box Scenario
        3.2 A Study of an Attack Method That Increases the Concentration of Poisoning Points to Steepen the Gradient Tilt
        3.3 A Study of a Method for Quick Attacks
        3.4 Transferability Between the Original Model and the Substitute Model
        3.5 A Quick-to-Deploy, Model-Agnostic Data Poisoning Attack
            3.5.1 Algorithm and Example
    4 Experiments and Analysis
        4.1 Datasets and Experimental Environment
            4.1.1 Datasets
        4.2 Experimental Design
            4.2.1 White-box Scenario
            4.2.2 Gray-box Scenario
        4.3 Analysis and Results
    5 Conclusions
    References


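    Full-text release date: 2025/07/11 (campus network)
    Full-text release date: 2025/07/11 (off-campus network)
    Full-text release date: 2025/07/11 (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)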