
Author: 周宜玫 (Yi-Mei Chou)
Thesis title: Pose-Pairing Encoding and Normalization for Face Recognition
Advisor: 徐繼聖 (Gee-Sern Hsu)
Committee members: 花凱龍 (Kai-Long Hua), 陳祝嵩 (Chu-Song Chen), 鄭文皇 (Wen-Huang Cheng)
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Publication year: 2021
Graduation academic year: 109
Language: Chinese
Pages: 68
Keywords: Face Recognition, Face Normalization

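This study proposes pose-pairing encoders and normalization to address two common issues in face recognition. First, in face recognition systems, frontalization is often regarded as a good way to improve recognition performance. However, when a face is at a relatively extreme pose, forcing it into a frontal view degrades recognition and can even cause misidentification. To solve this problem, this study designs Pose-Pairing Encoders, which pair poses and, depending on the pose of each face, encode more representative features so that faces with extreme poses are recognized better. In addition, to further improve recognition performance, this study proposes a multi-pose normalization that improves matching across poses by generating higher-quality faces. The second issue concerns the template-based matching test protocol, which averages the features of all images in a template and then matches this averaged feature against the averaged feature of another template. Error analysis shows that, for highly representative images or images of very poor quality, averaging reduces the influence of their features. This study therefore proposes a different testing scheme, called Pose-Pairing Matching, which obtains the pose pairs from all images in the templates and uses the corresponding pose-pairing encoders to obtain the best-matching features. This scheme makes more effective use of the images in a template. The method is validated on IJB-A, a database commonly used for face recognition, and shows highly competitive results compared with other current state-of-the-art face recognition models.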


We propose Pose-Pairing Encoding and Normalization to address two issues in face recognition. The first issue concerns face frontalization, which is generally considered a good form of face normalization for improving face recognition, but whose performance usually drops when it is applied to extreme poses. To handle this issue, we design pose-pairing encoders that encode a face depending on its pose, and we conduct pose-pairing matching to better recognize faces with extreme poses. Additionally, we propose pose-pairing normalization, which further improves pose-pairing matching performance by generating better-quality faces. The second issue concerns template-based matching, which is commonly performed by averaging the features in a template and matching the averaged feature with that of another template. We propose a different scheme that considers the most likely matches from similar pose pairs, transforming template matching into pose-pairing matching, and we verify its effectiveness. Our approach is verified on common benchmark databases and compared with other methods.
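The contrast between averaged template matching and matching over similar pose pairs can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes each template is a list of `(yaw_degrees, feature_vector)` tuples, uses cosine similarity as the score, and replaces the learned pose-pairing encoders with the raw features. The function names and the `max_pose_gap` threshold are hypothetical.

```python
import math
from itertools import product


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_feature(template):
    """Element-wise mean of all feature vectors in a template."""
    feats = [feat for _, feat in template]
    return [sum(col) / len(feats) for col in zip(*feats)]


def average_matching(t1, t2):
    """Baseline: compare the averaged features of two templates."""
    return cosine(mean_feature(t1), mean_feature(t2))


def pose_pairing_matching(t1, t2, max_pose_gap=30.0):
    """Score two templates by their best cross-template image pair.

    Assumption: pairs whose yaw angles differ by at most max_pose_gap
    count as "similar pose pairs". A real system would pick a
    pair-specific encoder here; this sketch reuses the raw features.
    """
    pairs = list(product(t1, t2))
    similar = [(a, b) for a, b in pairs if abs(a[0] - b[0]) <= max_pose_gap]
    candidates = similar or pairs  # fall back to all pairs if none are close
    return max(cosine(a[1], b[1]) for a, b in candidates)


# Hypothetical toy templates: (yaw in degrees, 2-D feature vector)
template_a = [(0.0, [1.0, 0.0]), (80.0, [0.0, 1.0])]
template_b = [(75.0, [0.0, 1.0])]

print(average_matching(template_a, template_b))
print(pose_pairing_matching(template_a, template_b))
```

In this toy case the averaged feature of `template_a` mixes the frontal and profile vectors, so the baseline score is lower than the score of the single profile-to-profile pair that the pose-restricted search selects.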

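Abstract (in Chinese)
Abstract (in English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Method Overview
1.3 Contributions
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Face Normalization
2.1.1 FNM
2.1.2 TP-GAN
2.1.3 DR-GAN
2.1.4 FF-GAN
2.1.5 PIM
2.1.6 HF-PIM
2.1.7 CAPG-GAN
2.2 Face Recognition
2.2.1 ArcFace
2.2.2 Light CNN
2.2.3 VGGFace2
2.3 IJB-A Evaluation Method
Chapter 3 Proposed Method
3.1 Overall Network Architecture
3.2 Pose-Pairing Module
3.3 Multi-Pose Normalization Design
3.3.1 Multi-Pose Generator
3.3.2 Multi-Pose Discriminator
3.4 Normalization Selection
3.5 Pose-Pairing Encoder Design
3.6 Pose-Pairing Matching Evaluation Method Design
Chapter 4 Experimental Setup and Analysis
4.1 Databases
4.1.1 VGGFace2
4.1.2 Multi-PIE
4.1.3 IJB-A
4.2 Experimental Setup
4.2.1 VGGFace2 Data Split and Settings
4.2.2 Multi-PIE Data Split and Settings
4.2.3 IJB-A Data Split and Settings
4.2.4 Performance Metrics
4.2.5 Experimental Design
4.3 Experimental Results and Analysis
4.3.1 Comparison of Pose-Pairing Module Settings
4.3.2 Comparison of Multi-Pose Normalization Settings
4.3.3 Comparison of Normalization Selection Settings
4.3.4 Validation of the Pose-Pairing Encoders for the Pose-Pairing Module
4.3.5 Validation of the Pose-Pairing Matching Evaluation Method
4.4 Comparison with Related Work
Chapter 5 Conclusion and Future Work
Chapter 6 References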

