
Author: 謝宏晟 (Hung-Cheng Xie)
Thesis Title: 結合身份與屬性特徵之人臉辨識 (Combination of Identity and Attribute Features for Face Recognition)
Advisor: 徐繼聖 (Gee-Sern Hsu)
Committee Members: 洪一平 (Yi-Ping Hung), 王鈺強 (Yu-Chiang Wang), 郭景明 (Jing-Ming Guo), 劉雲夫 (Yun-Fu Liu)
Degree: Master
Department: College of Engineering, Department of Mechanical Engineering
Year of Publication: 2017
Graduation Academic Year: 105
Language: Chinese
Number of Pages: 51
Chinese Keywords: 深度學習, 人臉辨識, 人臉屬性
English Keywords: Deep Learning, Face Recognition, Facial Attributes
Access Counts: Views: 602; Downloads: 27

We propose the Facial Attribute Assistant Network (FAAN) for face recognition. The network is intended to imitate the way a face is described by its attributes, such as gender, ethnicity, and hair color. FAAN consists of two sub-networks, one for identity and one for attributes; given a face image, the two sub-networks produce identity and attribute feature descriptions of the image. For identity search, the attribute features first screen the gallery set, and only the samples whose attribute descriptions are similar to those of the input image are kept for identity matching. For face verification, we concatenate the identity and attribute features, which makes it easier to tell whether two faces belong to the same person.
To our knowledge, FAAN is the first method to combine identity and attribute features for face recognition. Compared with other methods, FAAN demonstrates competitive recognition rates on the public benchmarks IJB-A, YTF, and MPIE.


We propose the Facial Attribute Assistant Network (FAAN) for face recognition. The network is developed to imitate the generic description of a human face by its facial attributes, such as gender, ethnicity, and hair color. The network is composed of two sub-networks, namely the Identity subnet and the Attribute subnet. Given a face as input, the Identity subnet renders the identity descriptor, and the Attribute subnet renders the attribute features. When searching for a match in a gallery set, the attributes can be exploited for pre-screening: only the subjects with attributes similar to those of the probe are selected for matching with their identity features. For face verification, we concatenate the identity and attribute features so that intra-personal and extra-personal pairs can be better distinguished. To our knowledge, FAAN is the first network that combines identity and attribute features for face recognition. Compared with other state-of-the-art methods, FAAN demonstrates competitive performance on the IJB-A, YTF, and MPIE benchmarks.
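The identification and verification procedures described in the abstract can be sketched as follows. This is a minimal Python sketch under assumptions: identity_net and attribute_net are hypothetical stand-ins for the two sub-networks (each mapping an aligned face image to a 1-D feature vector), and the cosine metric and thresholds are illustrative choices rather than the settings used in the thesis.

```python
# Minimal sketch of attribute pre-screening (identification) and
# feature concatenation (verification); not the authors' implementation.
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def identify(probe_img, gallery, identity_net, attribute_net, attr_threshold=0.5):
    """Identification: screen the gallery by attribute similarity first,
    then rank the surviving subjects by identity-feature similarity."""
    probe_id = identity_net(probe_img)
    probe_attr = attribute_net(probe_img)
    candidates = []
    for subject_id, img in gallery:  # gallery: list of (label, image) pairs
        if cosine(probe_attr, attribute_net(img)) >= attr_threshold:
            candidates.append((subject_id, cosine(probe_id, identity_net(img))))
    # Return the best-matching gallery subject, or None if nothing passes screening.
    return max(candidates, key=lambda t: t[1]) if candidates else None

def verify(img_a, img_b, identity_net, attribute_net, threshold=0.6):
    """Verification: concatenate identity and attribute features and
    compare the joint descriptors of the two faces."""
    feat_a = np.concatenate([identity_net(img_a), attribute_net(img_a)])
    feat_b = np.concatenate([identity_net(img_b), attribute_net(img_b)])
    return cosine(feat_a, feat_b) >= threshold
```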

Table of Contents:
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Background and Motivation
  1.2 Overview of the Method
  1.3 Contributions
  1.4 Thesis Organization
Chapter 2 Literature Review
  2.1 Face Recognition
    2.1.1 DeepID
    2.1.2 Pose-Aware Model (PAM)
    2.1.3 DCNNfusion
    2.1.4 A Discriminative Feature
    2.1.5 Available Pre-Trained Face Models
  2.2 Facial Attributes
    2.2.1 PANDA
    2.2.2 LNets+ANet
Chapter 3 Proposed Method
  3.1 Convolutional Neural Network (CNN)
    3.1.1 Feedforward Pass
    3.1.2 Backpropagation Pass
    3.1.3 Convolution Layer
    3.1.4 Pooling Layer
    3.1.5 Activation Function
    3.1.6 Dropout Layer
  3.2 Network Architectures
    3.2.1 VGG
    3.2.2 ResNet
  3.3 Regressive Tree Structured Model
  3.4 Recognition with Combined Identity and Attribute Features
Chapter 4 Experimental Setup and Analysis
  4.1 Benchmark Databases
    4.1.1 CASIA-WebFace
    4.1.2 Multi-PIE
    4.1.3 IJB-A
    4.1.4 YTF
    4.1.5 CelebA
    4.1.6 LFWA
    4.1.7 MORPH
    4.1.8 CAS-PEAL
  4.2 Experimental Design
    4.2.1 Identity Feature Model Training
    4.2.2 Attribute Feature Model Training
  4.3 Experimental Results and Analysis
    4.3.1 Identity Feature Vectors and Attribute Recognition Rates
    4.3.2 Comparison of Architectures and Training Databases
    4.3.3 Identity Features with and without Attribute Features
    4.3.4 Recognition Rates on the Test Databases
Chapter 5 Conclusion and Future Work
Chapter 6 References

