| Graduate student: | 謝宏晟 Hung-Cheng Xie |
|---|---|
| Thesis title: | 結合身份與屬性特徵之人臉辨識 Combination of Identity and Attribute Features for Face Recognition |
| Advisor: | 徐繼聖 Gee-Sern Hsu |
| Oral defense committee: | 洪一平 Yi-Ping Hung, 王鈺強 Yu-Chiang Wang, 郭景明 Jing-Ming Guo, 劉雲夫 Yun-Fu Liu |
| Degree: | Master (碩士) |
| Department: | College of Engineering, Department of Mechanical Engineering (工程學院 - 機械工程系) |
| Year of publication: | 2017 |
| Academic year of graduation: | 105 |
| Language: | Chinese |
| Number of pages: | 51 |
| Keywords (Chinese): | 深度學習、人臉辨識、人臉屬性 |
| Keywords (English): | Deep Learning, Face Recognition, Facial Attributes |
We propose the Facial Attribute Assistant Network (FAAN) for face recognition. The network is designed to imitate how a person describes a face through its attributes, such as gender, ethnicity, and hair color. FAAN is composed of two sub-networks, the Identity subnet and the Attribute subnet. Given a face image as input, the Identity subnet produces an identity descriptor and the Attribute subnet produces attribute features. For face identification, that is, searching a gallery set for a match, the attribute features are first used to pre-screen the gallery: only the subjects whose attributes are similar to those of the probe are retained, and these are then matched using their identity features. For face verification, we concatenate the identity and attribute features so that same-person (intra) and different-person (extra) pairs can be better distinguished.

To our knowledge, FAAN is the first network that combines identity and attribute features for face recognition. Compared with other state-of-the-art methods, FAAN demonstrates competitive performance on the IJB-A, YTF, and MPIE benchmarks.
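The two uses of the features described in the abstract, attribute pre-screening before identity matching and feature concatenation for verification, can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function names, the use of cosine similarity, and the `attr_threshold` value are all assumptions made for illustration, and the feature vectors are assumed to have already been extracted by the two subnets.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between vector `a` (d,) and each row of `b` (n, d).

    Assumes no vector is all zeros (illustrative sketch, no zero-norm guard).
    """
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return b @ a

def identify(probe_id, probe_attr, gallery_id, gallery_attr, attr_threshold=0.8):
    """Return the gallery index of the best match, or None if pre-screening
    rejects every gallery subject.

    `attr_threshold` is a hypothetical cut-off, not a value from the thesis.
    """
    # Step 1: attribute pre-screening. Keep only gallery subjects whose
    # attribute features are similar to those of the probe.
    attr_sim = cosine_similarity(probe_attr, gallery_attr)
    candidates = np.flatnonzero(attr_sim >= attr_threshold)
    if candidates.size == 0:
        return None
    # Step 2: identity matching, restricted to the surviving candidates.
    id_sim = cosine_similarity(probe_id, gallery_id[candidates])
    return int(candidates[np.argmax(id_sim)])

def verification_score(id_a, attr_a, id_b, attr_b):
    """Similarity of two faces using concatenated identity+attribute features."""
    feat_a = np.concatenate([id_a, attr_a])
    feat_b = np.concatenate([id_b, attr_b])
    return float(cosine_similarity(feat_a, feat_b[None, :])[0])
```

In this sketch, a verification decision would compare `verification_score(...)` against a threshold tuned on held-out pairs. How the thesis measures attribute similarity, and how it weights the two feature types before concatenation, is not stated in the abstract.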