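Graduate student: 黃冠樺 (Kuan-Hua Huang)
Thesis title: 多特徵融合技術之單模式影像人臉防偽辨識 (Multi-Feature Fusion Network for Single-modal Face Anti-spoofing)
Advisors: 花凱龍 (Kai-Lung Hua), 郭景明 (Jing-Ming Guo)
Committee members: 王乃堅 (Nai-Jian Wang), 夏至賢 (Chih-Hsien Hsia), 王元凱 (Yuan-Kai Wang)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar, 2020-2021)
Language: Chinese
Number of pages: 72
Keywords (Chinese, translated): face anti-spoofing, liveness detection, optical flow
Keywords (English): liveness detection, presentation attack detection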
    Face recognition systems are now widely deployed in real-life applications such as Face ID, e-commerce security, and airport security checks. Presentation attacks (PAs) have evolved alongside them, from simple photo and display attacks to more advanced 3D mask attacks, all designed to impersonate another identity, which makes face anti-spoofing increasingly critical. Many recent methods rely on dedicated hardware, such as near-infrared (NIR) sensors, structured-light 3D sensors, Time-of-Flight (ToF) sensors, and thermal sensors, or on remote photoplethysmography (rPPG), to obtain additional features. These approaches generally achieve better performance, but in most scenarios users authenticate with their own devices, where algorithms that depend on extra sensors cannot be used. Face anti-spoofing must then rely solely on the RGB camera found in most consumer electronic devices, such as smartphones, tablets, and laptops.

    Given these constraints, this thesis focuses on distinguishing genuine faces from spoofed ones using color images only, without any additional sensor. We propose a face anti-spoofing network that takes single-modal RGB images as input and fuses multi-frame features, namely optical flow and rank pooling. The architecture is chosen to learn features at different scales: bypass connections and a spatial attention mechanism fuse the features from different layers, which effectively blocks planar attacks.

    Experiments on CASIA-SURF CeFA, currently the largest cross-ethnicity face anti-spoofing dataset, show that the proposed method achieves state-of-the-art results among single-modal (RGB) approaches. It improves on the previous state of the art by 1.17% in Average Classification Error Rate (ACER) and by 2.5% in Bona Fide Presentation Classification Error Rate (BPCER). The model copes with unknown planar attacks and with ethnic diversity, and its performance approaches that of state-of-the-art multi-modal (RGB, depth, and IR) methods.
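    The two metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions: APCER is the fraction of attack presentations accepted as bona fide, BPCER is the fraction of bona fide presentations rejected as attacks, and ACER is their mean. The function name `pad_error_rates`, the label convention (1 = bona fide, 0 = attack), and the sample data are hypothetical and are not taken from the thesis; the thesis's exact evaluation protocol may differ, for example by reporting APCER per attack type.

```python
from typing import Dict, Sequence


def pad_error_rates(labels: Sequence[int], predictions: Sequence[int]) -> Dict[str, float]:
    """Compute APCER, BPCER and ACER from binary decisions.

    Assumed convention (not from the thesis): 1 = bona fide, 0 = attack.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    # Split the decisions by ground-truth class.
    attack_preds = [p for y, p in zip(labels, predictions) if y == 0]
    bona_fide_preds = [p for y, p in zip(labels, predictions) if y == 1]
    if not attack_preds or not bona_fide_preds:
        raise ValueError("both classes must be present")

    # APCER: attacks that were wrongly accepted as bona fide.
    apcer = sum(1 for p in attack_preds if p == 1) / len(attack_preds)
    # BPCER: bona fide presentations that were wrongly rejected as attacks.
    bpcer = sum(1 for p in bona_fide_preds if p == 0) / len(bona_fide_preds)
    # ACER: mean of the two error rates.
    return {"APCER": apcer, "BPCER": bpcer, "ACER": (apcer + bpcer) / 2}


# Toy data: 4 bona fide samples (2 rejected) and 4 attacks (1 accepted).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 0, 0, 0, 0, 0, 1]
rates = pad_error_rates(labels, predictions)
print(rates)
```

    On this toy data, one of four attacks is accepted and two of four bona fide samples are rejected, so APCER is 0.25, BPCER is 0.5, and ACER is 0.375. Because ACER averages the two rates, a lower BPCER at a fixed APCER lowers ACER by half as much.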

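    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1.1 Research Background and Motivation
        1.2 Types of Face Spoofing Attacks
        1.3 Thesis Organization
    Chapter 2 Literature Review
        2.1 Types of Face Anti-Spoofing Techniques
            2.1.1 Methods Based on Liveness Cues
            2.1.2 Methods Based on Texture Features
            2.1.3 Methods Based on 3D Geometry
            2.1.4 Multi-Feature Fusion Methods
            2.1.5 Methods Using Emerging Techniques
        2.2 Review of Face Anti-Spoofing Algorithms
            2.2.1 Creating Artificial Modalities to Solve RGB Liveness
            2.2.2 Multi-Modal Face Anti-Spoofing Based on Central Difference Networks
    Chapter 3 Multi-Feature Fusion Network for Single-modal Face Anti-Spoofing
        3.1 System Overview
        3.2 Network Architecture
        3.3 Dataset and Data Preprocessing
            3.3.1 Dataset: CASIA-SURF CeFA
            3.3.2 Data Preprocessing
        3.4 Training Method
    Chapter 4 Experimental Data and Results
        4.1 Test Environment
        4.2 Evaluation Metrics
        4.3 Ablation Study and Comparison
        4.4 Experimental Results
    Chapter 5 Conclusion and Future Work
    References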

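    Full-text release date: 2024/10/06 (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)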