
Graduate student: 黃煌 (Huang Huang)
Thesis title: 生成可附著的對抗補丁使物件辨識網路出錯 (Generating attachable adversarial patches to make the object identification wrong based on neural networks)
Advisor: 洪西進 (Shi-Jinn Horng)
Committee members: 李正吉 (Cheng-Chi Lee), 楊昌彪 (Chang-Biau Yang), 楊竹星 (Chu-Sing Yang), 林韋宏 (Weber Lin)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Publication year: 2021
Academic year of graduation: 109 (ROC calendar, 2020-2021)
Language: Chinese
Number of pages: 60
Chinese keywords: 深度學習, 神經網路, 黑盒攻擊, 對抗補丁
English keywords: deep learning, neural network, black box attack, adversarial patch

Many of today's deep learning techniques, such as image classification, object detection, image recognition, and semantic segmentation, achieve good results under relatively ideal conditions. However, once a small perturbation is added or the data is slightly modified, a network that has already been trained may no longer produce the desired results. Research has shown that deception exists in deep learning: disguised images or objects can sometimes easily fool trained models. An adversarial example uses a small perturbation to make a neural network misclassify; such subtle perturbations are usually harmless to human perception but fatal to the network. Some perturbation attacks cannot even be detected by human observation, yet they still disturb the network. To date, no method can defend against every kind of perturbation attack, which raises further doubts about neural networks. The perturbation attack method in this thesis offers both high accuracy and high efficiency. Three sub-models are proposed. The attack-range model effectively narrows the attack region and guides the perturbation algorithm to carry out precise attacks, which reduces the execution time and improves the model's efficiency. The perturbation attack model generates different adversarial patches through a perturbation algorithm; these patches are small and can be fabricated by hand. Such a patch is a natural perturbation that does not add a large amount of noise to the input image, and it can be attached directly to the original image to launch an efficient and accurate attack on the target model. The proposed model is a black-box attack: it requires no knowledge of the target model's parameters or structure, and it can effectively attack both dataset images and natural photographs. By generating a tiny adversarial patch, the attack success rate reaches 70.1%. Beyond examining the attack's effect on different neural networks, this thesis also develops an on-line system that captures images from a camera or the Internet, feeds them into the target network for the perturbation attack, and outputs the perturbed image together with the position where the adversarial patch should be placed. To verify this result, a patch can be attached at that position on the input image in advance; when the target network processes the image, it again produces an incorrect classification, confirming the effectiveness of the perturbation method.


Nowadays, many deep learning technologies, such as image classification, object detection, image recognition, and semantic segmentation, can achieve good results in relatively ideal environments. However, once a disturbance is added or the data is slightly modified, a network that has already been trained may no longer achieve the desired results. Research has shown that deception exists in the field of deep learning: disguised pictures or objects can sometimes easily deceive trained models. An adversarial example makes a network misclassify through a small disturbance, one that is often harmless to human cognition but fatal to the neural network. Some disturbance attacks cannot even be detected by human observation, yet they still disturb the network. At present there is no way to resist every kind of disturbance attack, which raises further doubts about neural network architectures. Research on adversarial examples provides a deeper understanding of how a neural network computes and can also help prevent unexpected attacks where the network is weak. The perturbation attack method in this thesis has high accuracy and high efficiency. Three sub-models are proposed. The attack-scope model effectively reduces the attack range and guides the adversarial algorithm to conduct accurate perturbation attacks, which lowers the running time and improves the efficiency of the model. The adversarial attack model generates different adversarial patches through an adversarial algorithm; the patches are compact and can be manufactured by hand. Such a patch is a natural disturbance that does not add a large amount of noise to the input image, and it can be attached directly to the original image to disturb the target model efficiently and accurately. The proposed model is a black-box attack: it needs neither the parameters nor the structural information of the target model, and it can effectively disturb both dataset images and natural pictures. By generating a small adversarial patch, the attack success rate reaches 70.1%. This thesis also discusses the perturbation attack's effect on different neural networks. In addition, an on-line system has been built that can capture images from a camera or the Internet, feed them into the perturbation attack model, and output the image after a successful perturbation together with the location where the patch should be placed. To verify the validity of this result, a patch can be attached at that location on the input image beforehand; when the target network identifies the image, it again produces an incorrect classification, confirming the effectiveness of the perturbation method.
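The attack described above (searching for a small attachable patch inside a restricted attack range, judging success only from the target network's outputs) can be illustrated with a minimal query-only sketch. The function names, the random-search strategy, the solid-colour patches, the patch size, and the query budget below are illustrative assumptions, not the thesis's actual algorithm, which the abstract describes only at a high level.

import random
import torch
import torch.nn.functional as F

def apply_patch(image, patch, x, y):
    # Paste `patch` (C, h, w) onto a copy of `image` (C, H, W) at column x, row y.
    patched = image.clone()
    _, h, w = patch.shape
    patched[:, y:y + h, x:x + w] = patch
    return patched

def black_box_patch_attack(model, image, true_label, region, patch_size=5, queries=500):
    # Query-only search: try random solid-colour patches at random positions
    # inside the attack range `region` = (x0, y0, x1, y1) until the predicted
    # class no longer matches `true_label`. `image` is assumed to lie in [0, 1].
    x0, y0, x1, y1 = region
    for _ in range(queries):
        patch = torch.rand(image.shape[0], patch_size, patch_size)
        x = random.randint(x0, x1 - patch_size)
        y = random.randint(y0, y1 - patch_size)
        candidate = apply_patch(image, patch, x, y)
        with torch.no_grad():
            logits = model(candidate.unsqueeze(0))      # black-box query
        pred = int(F.softmax(logits, dim=1).argmax(dim=1))
        if pred != true_label:                          # classification flipped
            return candidate, (x, y), pred
    return None                                         # no success within the budget

Under these assumptions, the returned (x, y) coordinates play the role of the patch position that the on-line system reports, and the thesis's attack-range model corresponds to supplying a tighter `region`, so that fewer queries are wasted on locations the target model is insensitive to.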

Chapter 1  Introduction
 1.1  Research motivation and purpose
 1.2  Related work
Chapter 2  System architecture and hardware specifications
 2.1  System architecture
 2.2  Hardware specifications
Chapter 3  Convolutional neural networks and object recognition
 3.1  Convolutional neural networks
 3.2  Object recognition system LeNet-5
 3.3  Object recognition system ResNet
Chapter 4  Adversarial examples
 4.1  Adversarial perturbations / adversarial examples
 4.2  Adversarial example attacks
 4.3  Adversarial example attack techniques
Chapter 5  Research method and results
 5.1  Research procedure
 5.2  Object recognition models
 5.3  Proposed models
  5.3.1  Preprocessing model
  5.3.2  Attack-range model
  5.3.3  Perturbation attack model
 5.4  Experimental results
Chapter 6  Conclusion
 6.1  Research results
 6.2  Future work

