| Graduate student: | 藍翊中 Yi-Jung Lan |
|---|---|
| Thesis title: | 一個基於多任務級聯式卷積網絡與殘差累積變換的深度神經網路用以偵測影片中偽造人臉影像的方法 A Forgery Detection Method for Fake Human Faces in Videos Based on Multi-task Cascaded Convolutional Networks and Aggregated Residual Transformations for Deep Neural Networks |
| Advisor: | 范欽雄 Chin-Shyurng Fahn |
| Oral defense committee: | 黃榮堂 Rung-Tang Huang, 林啟芳 Chi-Fang Lin, 吳怡樂 Yi-Le Wu |
| Degree: | 碩士 Master |
| Department: | 電資學院 - 資訊工程系 Department of Computer Science and Information Engineering |
| Year of publication: | 2020 |
| Academic year of graduation: | 108 |
| Language: | English |
| Pages: | 53 |
| Keywords (Chinese): | 合成影像偽造, 深度學習, 人臉偵測, 偽造偵測, 多任務級聯式卷積網絡, 殘差累積變換 |
| Keywords (English): | synthetic image forgery, Deepfake, deep learning, face detection, forgery detection, Multi-task Cascaded Convolutional Networks |
情報與人類社會一樣古老,和人類社會的發展密不可分;人類社會和科技的發展與變化決定了情報型態的演化,小至個人生活大至國際關係,正確傳遞情報的重要性,在人類歷史中一再的被證明,偽造與辨別情報技術的相互攻防從未停止。
這幾年來,由於圖形處理器(GPU)運算速度的快速成長,使得合成圖像的製造和偽造所可能造成的影響日益重要;Deepfake的出現使得AI生成的人像逼真到足以影響人們對於網路上資訊是否正確的判斷。本研究提出一個能夠自動檢測影像是否經過Deepfake變造影像中人臉的方法,用以提供一個快速過濾造假影像的機制,以期減少假資訊對公共社會造成的危害,其商業價值上的應用,可在未來與網路社群平台合作,提供對影像資訊的認證,輔助使用者過濾假訊息,增加假人物身份所捏造出的假資訊,透過社交平台快速傳播的難度。
本論文所提之方法共分兩個階段,分別使用了兩種不同的深度學習架構。我們在第一階段中,採取一個高準確率且高效的多任務級聯式卷積網絡(Multi-task Cascaded Convolutional Networks),用以檢測並提取出影片中的人臉,接著便可進行Deepfake人臉變造影像偵測;在第二階段中,我們使用了最新的深度學習影像分類架構,它是採用殘差累積變換(Aggregated Residual Transformations)的設計,判斷第一階段所提取之人臉影像是否經過變造。於實驗的部份,我們使用FaceForensic++ 公開資料集進行偽造影像偵測性能的評估與分析,在經Deepfake 變造影像中,我們達到90.2% 的準確率,而在未經變造的影像上獲得74.2% 的準確率;同時在我們的使用者調查中,顯示本方法在Deepfake 變造影像中有著優於人類的辨識率。
Information is as old as human society and inseparable from its development. Changes in society and technology have shaped the forms that information takes, from personal life to international relations. History has repeatedly shown how important it is to transmit information accurately, and the contest between techniques for forging information and techniques for detecting forgeries has never stopped.
In recent years, owing to the rapid growth in the computing speed of graphics processing units (GPUs), the impact of synthetic image generation and forgery has become increasingly important. Deepfake produces AI-generated videos of human faces that are realistic enough to affect how people judge the legitimacy of information found online. This thesis presents a method to automatically detect Deepfake-tampered human faces in videos, providing a fast screening mechanism to reduce the harm that misinformation causes to society.
Our proposed method consists of two phases, each using a different deep-learning architecture. The first phase uses a highly accurate and efficient Multi-task Cascaded Convolutional Networks (MTCNN) face detector to detect and extract human faces from the video. The cropped face images are then passed on for forgery detection. In the second phase, we use a state-of-the-art deep-learning image classification architecture, built on Aggregated Residual Transformations, to determine whether each extracted face has been manipulated. In the experiments, we use the public FaceForensics++ dataset to evaluate and analyze forgery detection performance. Our method reaches an accuracy of 90.2% on Deepfake-manipulated images and 74.2% on original, unmanipulated images. Our user study also indicates that the method detects Deepfake-manipulated images more accurately than humans do.
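
The two-phase structure and the per-class accuracy figures described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: `detect_forgery`, `per_class_accuracy`, the 0.5 threshold, and the toy stand-in functions are all hypothetical names and values chosen for illustration. A real system would plug in an MTCNN face detector and an Aggregated Residual Transformations classifier where the stand-ins are.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple


@dataclass
class FacePrediction:
    frame_index: int   # which video frame the face came from
    is_fake: bool      # classifier decision for that face crop


def detect_forgery(
    frames: Iterable[object],
    detect_faces: Callable[[object], Sequence[object]],
    classify_face: Callable[[object], float],
    threshold: float = 0.5,  # assumed decision threshold, not from the thesis
) -> List[FacePrediction]:
    """Phase 1 crops faces from each frame; phase 2 scores each crop."""
    results: List[FacePrediction] = []
    for index, frame in enumerate(frames):
        for face in detect_faces(frame):        # phase 1: face detector
            fake_score = classify_face(face)    # phase 2: image classifier
            results.append(FacePrediction(index, fake_score >= threshold))
    return results


def per_class_accuracy(
    labels: Sequence[bool], predictions: Sequence[bool]
) -> Tuple[float, float]:
    """Return (accuracy on manipulated samples, accuracy on original samples)."""
    n_fake = sum(1 for y in labels if y)
    n_real = len(labels) - n_fake
    if n_fake == 0 or n_real == 0:
        raise ValueError("both classes must be present")
    fake_hits = sum(1 for y, p in zip(labels, predictions) if y and p)
    real_hits = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    return fake_hits / n_fake, real_hits / n_real


# Toy stand-ins (hypothetical): frames are strings instead of images.
def toy_detect(frame: str) -> List[str]:
    return [frame] if "face" in frame else []


def toy_classify(face: str) -> float:
    return 0.9 if face.startswith("fake") else 0.1


if __name__ == "__main__":
    toy_frames = ["fake_face", "real_face", "no_person"]
    for prediction in detect_forgery(toy_frames, toy_detect, toy_classify):
        print(prediction)
```

Reporting accuracy separately for manipulated and original images, as the abstract does, keeps a weakness on one class from being hidden by strength on the other.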