
Graduate student: 宋狄勳 (Ti-Hsun Sung)
Thesis title: 基於跨語言嵌入生成和檢測對抗式語碼轉換樣本
(Generating and Detecting Adversarial Code-switching Examples Using Cross-lingual Embedding)
Advisor: 李漢銘 (Hahn-Ming Lee)
Committee members: 李漢銘 (Hahn-Ming Lee), 林豐澤 (Feng-Tse Lin), 毛敬豪 (Ching-Hao Mao), 鄧惟中 (Wei-Chung Teng), 邱舉明 (Ge-Ming Chiu)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2020
Academic year of graduation: 108
Language: English
Number of pages: 72
Chinese keywords: 對抗式樣本, 跨語言嵌入, 詞嵌入, 語碼轉換
English keywords: adversarial example, cross-lingual embedding, word embedding, code-switching
    In recent years, AI has been widely adopted and relied upon, yet the emergence of adversarial examples has undermined the robustness of AI models. An adversarial example is a specially crafted input that causes a model to make a wrong prediction while leaving human judgment unaffected. Unlike most prior work, which targets images, we focus adversarial-example research on text. We propose a generation method for adversarial code-switching examples and an adversarial code-switching detector to defend against the proposed attack. The generation method injects the semantic bias present in a cross-lingual embedding into monolingual text, producing bilingual code-switching text that misleads AI models. The dataset generated by the proposed method is also used to verify the method's effectiveness against a black-box model. In addition, appropriate features are selected, based on analysis of the dataset, to train the adversarial code-switching detector. The results show that adversarial code-switching examples generated from semantic bias mislead the API with a 26.08% success rate, and the trained adversarial code-switching detector achieves an AUC of 0.72, an acceptable level of detection. The main contributions of this work are: (1) an adversarial example that uses code-switching as the perturbation and does mislead models; (2) a black-box generation method that needs neither the target model's parameters nor its architecture to mount the attack; (3) a dataset that can serve as a resource for follow-up research; (4) an analysis of the dataset that identifies possible causes of model misprediction; and (5) a detector trained on a richer feature set, providing a preliminary defense.


    In recent years, Artificial Intelligence (AI) has been widely used and relied on by human beings, but adversarial examples have affected the robustness of AI models. An adversarial example is a specially designed input that may mislead an AI model without affecting human judgment. Unlike previous studies on images, we focus adversarial-example research on text. We propose an adversarial code-switching generation method and an adversarial code-switching detector to defend against the proposed attack. The generation method converts a monolingual text into a bilingual code-switching text through a cross-lingual embedding: because semantic bias exists in the cross-lingual embedding, adding that bias to a clean example may mislead an AI model. A dataset generated by the proposed method is also used to verify its effectiveness against a black-box model. Furthermore, features selected through analysis of the data are used to train the adversarial code-switching detector. The results of this study show that 26.08% of the adversarial code-switching examples generated from semantic bias successfully mislead the Cloud Natural Language API, and the trained adversarial detector achieves an AUC of 0.72, an acceptable level of detection. The main contributions of this research are as follows: (a) the proposed semantic bias can indeed perturb the text and mislead the model; (b) the generation method is a black-box attack, achievable without knowing the parameters or algorithms of the model; (c) the proposed dataset can serve as a resource for future research; (d) possible causes of model misprediction are identified through analysis of the dataset; and (e) a detector trained on a richer feature set provides a preliminary defense for text models.
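    As a rough illustration of the generation idea described in the abstract, the sketch below replaces a chosen keyword with its nearest neighbor in a shared cross-lingual embedding space, producing a code-switched sentence. Everything here is hypothetical: the toy vectors, vocabulary, and function names are made up for illustration, and a real pipeline would draw neighbors from MUSE-style aligned embeddings rather than a hand-written dictionary.

```python
import math

# Toy "aligned" English/Chinese embeddings (hypothetical values; a real
# system would load vectors aligned into one shared space).
en_emb = {"movie": [0.9, 0.1, 0.0], "great": [0.1, 0.9, 0.2]}
zh_emb = {"電影": [0.88, 0.12, 0.02], "很棒": [0.12, 0.85, 0.25], "糟糕": [0.0, -0.9, 0.1]}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def cross_lingual_substitute(word):
    """Return the target-language word nearest to `word` in the shared space."""
    vec = en_emb[word]
    return max(zh_emb, key=lambda w: cosine(vec, zh_emb[w]))

def generate_code_switching(sentence, keywords):
    """Swap selected keywords for their nearest cross-lingual neighbors,
    carrying over whatever semantic bias lives in the embedding alignment."""
    return " ".join(cross_lingual_substitute(w) if w in keywords else w
                    for w in sentence.split())

adv = generate_code_switching("the movie was great", {"great"})
```

Because the nearest neighbor is chosen purely by embedding geometry, an imperfect alignment can pull in a word whose sentiment or sense drifts from the original — which is the bias the attack exploits.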

    Chinese Abstract
    Abstract
    Acknowledgments
    1 Introduction
      1.1 Motivation
      1.2 Challenges and Goals
      1.3 Contributions
      1.4 Outline of the Thesis
    2 Background and Related Work
      2.1 Adversarial Example
      2.2 Granularity of Perturbation
      2.3 Cross-lingual Embedding
      2.4 Code-switching
      2.5 Impact of Adversarial Examples in Text
    3 System Architecture
      3.1 Adversarial Code-switching Dataset Generation
        3.1.1 Keyword Separator
        3.1.2 Cross-lingual Embedding Generation
        3.1.3 Sentence Merger
        3.1.4 Labeling
      3.2 Adversarial Code-switching Detector
        3.2.1 Preprocessing
        3.2.2 Feature Extractor
        3.2.3 Training Phase
        3.2.4 Testing Phase
    4 Experiments & Analysis
      4.1 Environment Setup and Dataset
        4.1.1 Experimental Design
        4.1.2 Data Generation and Label
      4.2 Dataset Analysis
      4.3 Evaluation Metrics
      4.4 Effectiveness of Adversarial Code-switching Detector
        4.4.1 Feature Testing
        4.4.2 Data Imbalance
    5 Discussion & Conclusions
      5.1 Limitation
      5.2 Conclusions
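    The detector's reported AUC of 0.72 is the evaluation metric covered under "Evaluation Metrics" above. For reference, AUC can be computed directly from scores via the rank-based (Mann–Whitney) formulation: the probability that a randomly chosen positive example scores above a randomly chosen negative one. The sketch below is a minimal, self-contained illustration with made-up scores, not the thesis's evaluation code.

```python
def auc(labels, scores):
    """AUC via the Mann-Whitney formulation: fraction of (positive, negative)
    pairs where the positive scores higher, counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy detector output: two adversarial (1) and two clean (0) examples.
score = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2])
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect ranking, which is why the thesis treats 0.72 as an acceptable but preliminary level of detection.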

