
Author: 謝義桐 (Yi-Tung Hsieh)
Thesis Title: 使用對抗式匹配分析偵測基於幾何轉換的對抗式樣本 (Detecting Geometric Transformation-based Adversarial Attack using Adversarial Matching Analysis)
Advisor: 李漢銘 (Hahn-Ming Lee)
Committee Members: 邱舉明, 鄧惟中, 林豐澤, 毛敬豪
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2019
Graduation Academic Year: 107
Language: English
Number of Pages: 43
Chinese Keywords: 對抗式樣本, 對抗式攻擊, 深度神經網路
English Keywords: Adversarial example, Adversarial attack, Deep Neural Networks
In recent years, deep neural networks have developed and advanced continuously, achieving impressive results in many tasks. To obtain better results, much effort has been devoted to datasets, feature processing, model architectures, and parameter tuning, while the robustness of the models is often overlooked. An adversarial attack is carefully designed to make a model misclassify; moreover, the perturbation is constrained while the adversarial example is generated, so the attack is not only highly destructive but also hard to perceive.

In the arms race between adversarial attacks and defenses, geometric transformation-based adversarial attacks differ from earlier attacks in that they do not scatter perturbations around the image. They are therefore more imperceptible, and they also make earlier defense techniques perform worse than expected. In this study, we propose a spatial transformed adversarial detection system that treats the local pixel-change perturbations of adversarial examples as a kind of noise, uses image smoothing techniques to reduce the perturbations, and then analyzes the degree of matching between the two images through adversarial matching analysis to detect suspicious samples.

Our results show that comparing the inconsistency of local pixel changes between a sample and its smoothed version through adversarial matching analysis achieves an F1-measure of 86.05%. The contributions of this study are: (1) a detection system that can detect geometric transformation-based adversarial examples early, before the sample enters the target system; (2) the extraction of matching-anomaly features through adversarial matching analysis.


Deep neural networks have been continuously developing and progressing, and they have achieved impressive results in many tasks. However, the robustness of the model has received little attention. An adversarial attack is an attack that is hard to perceive and is intentionally designed to make the model misclassify.

Different from previous studies, the adversarial attack based on geometric transformation carries no additive adversarial noise; it is not only more imperceptible but also makes previous defense methods less effective than expected. In this thesis, we propose a spatial transformed adversarial detector that treats the local pixel-transformation noise as a kind of image noise and uses image smoothing techniques to reduce the perturbations. The degree of matching between the images before and after smoothing is then analyzed by adversarial matching analysis to detect adversarial examples.

According to the results, our detector can achieve an F1-measure of 86.05%. The main contributions of this thesis are as follows: (a) extracting matching-anomaly features through adversarial matching analysis; (b) introducing a detection system that can detect geometric transformation-based adversarial attacks early.
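
As a rough illustration of the detection idea described in the abstract, the following Python sketch smooths an image and compares it with the original through keypoint matching. It only assumes OpenCV and NumPy are available; the median blur, ORB descriptors, brute-force matching, and the particular match statistics are illustrative stand-ins chosen for this sketch, not the exact smoothing method or adversarial matching analysis used in the thesis.

import cv2
import numpy as np

def matching_features(image_bgr, ksize=3):
    """Return simple matching statistics between an image and its smoothed copy.
    Median blur and ORB matching are illustrative stand-ins for the smoothing
    and adversarial matching analysis steps described in the abstract."""
    # Step 1: reduce local pixel-level perturbations with an image smoothing filter.
    smoothed = cv2.medianBlur(image_bgr, ksize)

    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray_smoothed = cv2.cvtColor(smoothed, cv2.COLOR_BGR2GRAY)

    # Step 2: extract keypoints and descriptors from both versions of the image.
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(gray, None)
    kp2, des2 = orb.detectAndCompute(gray_smoothed, None)
    if des1 is None or des2 is None:
        return np.zeros(3)

    # Step 3: match descriptors. A benign image tends to match its smoothed copy
    # well, while locally displaced pixels tend to yield fewer or worse matches.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    distances = [m.distance for m in matches] or [0.0]

    return np.array([
        len(matches) / max(len(kp1), 1),  # fraction of keypoints matched
        float(np.mean(distances)),        # average match distance
        float(np.std(distances)),         # spread of match distances
    ])

The resulting per-image feature vector could then be fed to an ordinary classifier (for example, one from scikit-learn, which the thesis lists among its tools) trained on benign and adversarial samples to flag suspicious inputs.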

Table of Contents
Chinese Abstract
Abstract
Acknowledgements
1 Introduction
  1.1 Motivation
  1.2 Challenges and Goals
  1.3 Contributions
  1.4 Outline of the Thesis
2 Background and Related Work
  2.1 Adversarial Example
  2.2 Hypotheses on the Existence of Adversarial Examples
  2.3 Threat Model
  2.4 Adversarial Attacks
  2.5 Defense Techniques
3 System Architecture
  3.1 Spatial Transformed Adversarial Dataset Generation
    3.1.1 Spatial transformation
    3.1.2 Objective function
    3.1.3 Bilinear interpolation
  3.2 Spatial Transformed Adversarial Detector
    3.2.1 Preprocessing
    3.2.2 Feature extractor
    3.2.3 Training phase
    3.2.4 Testing phase
4 Experiments & Analysis
  4.1 Environment Setup and Dataset
    4.1.1 Experimental design
    4.1.2 Data generation and label
  4.2 Evaluation Metrics
  4.3 Effectiveness of Spatial Transformed Adversarial Detector
5 Conclusions and Further Work
  5.1 Discussions
  5.2 Conclusions
  5.3 Further Work


Full text release date: 2024/08/29 (campus network)
Full text release date: 2024/08/29 (off-campus network)
Full text release date: 2024/08/29 (National Central Library: Taiwan NDLTD system)