
Student: Wei-Chung Lin (林蔚中)
Thesis Title: Toward a Generic Deep Convolutional Neural Network Synthetic Image Detector (基於深度卷積神經網路合成圖像之通用檢測器)
Advisor: Yi-Leh Wu (吳怡樂)
Committee Members: Zheng-Yuan Tang (唐政元), Jian-Zhong Chen (陳建中), Li-Gang Yan (閻立剛)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2021
Graduation Academic Year: 109
Language: English
Pages: 30
Keywords: Deep Learning, Convolutional Neural Network, Synthetic Image Detector, ResNet, Spectrum Image


    Abstract: Deep learning has recently enabled many interesting applications, among which the synthesis of videos and images has attracted considerable public interest and concern. Synthesis models can generate realistic fake images that are difficult for people to distinguish from real ones, and improper use of this technology can cause serious harm. Because there are many deep-learning synthesis models, the goal of this thesis is to train a single detector that recognizes images generated by various synthesis models. We propose a general detection model that improves accuracy by feeding both image pixels and image spectrum information into the network. In our experiments, we use images generated by a single synthesis model as the training set and evaluate the generalization ability of our detector on 11 different synthesis models. We also analyze the influence of different feature-map operations, pre-trained parameters, and the number of fully connected layers, and tune these settings to achieve better performance. Compared with the original detection model, the average precision increases by 4.8%, reaching a mean average precision of 94.8%.
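    The spectrum information mentioned in the abstract is typically obtained from the 2D discrete Fourier transform of an image, where the periodic upsampling artifacts left by GAN generators appear as characteristic peaks. The following is a minimal sketch of such a feature, using NumPy; the function name and details are illustrative, not the thesis's actual code.

    ```python
    import numpy as np

    def log_spectrum(image: np.ndarray) -> np.ndarray:
        """Compute a centered log-magnitude spectrum of a grayscale image.

        GAN upsampling artifacts tend to show up as periodic peaks in
        this spectrum, which is why it is a useful extra input feature
        for a synthetic-image detector.
        """
        # 2D FFT, then shift the zero-frequency component to the center.
        freq = np.fft.fftshift(np.fft.fft2(image))
        # Log scale compresses the large dynamic range of the magnitudes.
        return np.log(np.abs(freq) + 1e-8)

    # Example: a constant 8x8 image puts all energy at the zero frequency.
    img = np.ones((8, 8))
    spec = log_spectrum(img)
    peak = np.unravel_index(np.argmax(spec), spec.shape)
    # After fftshift the peak sits at the center, (4, 4).
    ```

    A real-versus-fake classifier can then consume this map either as a standalone input (a spectrum classifier) or alongside the raw pixels.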

    Contents

    Abstract (Chinese)
    Abstract
    Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
      1.1 Research Background
      1.2 Research Motivation
    Chapter 2. Related Work
      2.1 Image Synthesis
        2.1.1 Generative Adversarial Network
        2.1.2 Deepfake
      2.2 Image Feature Extraction
        2.2.1 Deep Residual Network
    Chapter 3. Classifiers
      3.1 Baseline Classifier
      3.2 Spectrum Classifier
      3.3 Image Spectrum Classifier
    Chapter 4. Experiments
      4.1 Dataset
      4.2 Evaluate Metrics and Training Details
      4.3 Performance of The Spectrum Classifier
      4.4 Performance of The Image Spectrum Classifier
      4.5 The Impact of Feature Map Operators
      4.6 The Impact of Fixed Weight and Fine-Tuned Weight
      4.7 The Impact of Different Numbers of Fully Connected Layers
    Chapter 5. Conclusions and Future Works
    References
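    Chapter 3's image spectrum classifier combines pixel and spectrum information in one model. One common way to fuse the two, sketched below with NumPy, is to stack a normalized log-spectrum as an extra input channel for the CNN; this is a hypothetical illustration of the idea, not the architecture the thesis actually uses.

    ```python
    import numpy as np

    def fused_input(rgb: np.ndarray) -> np.ndarray:
        """Stack a normalized log-spectrum channel onto an RGB image.

        rgb: H x W x 3 array with values in [0, 1].
        Returns an H x W x 4 array usable as input to a CNN backbone
        (e.g. a ResNet with a 4-channel first convolution).
        """
        gray = rgb.mean(axis=2)  # simple luminance proxy
        spec = np.log(np.abs(np.fft.fftshift(np.fft.fft2(gray))) + 1e-8)
        # Min-max normalize the spectrum into the pixel value range [0, 1].
        spec = (spec - spec.min()) / (spec.max() - spec.min() + 1e-8)
        return np.concatenate([rgb, spec[..., None]], axis=2)

    x = fused_input(np.random.rand(32, 32, 3))
    # x has shape (32, 32, 4): three pixel channels plus one spectrum channel.
    ```

    Feeding both representations lets the detector exploit spatial artifacts and frequency-domain artifacts at the same time, which is the motivation given in the abstract for the improved accuracy.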

