
Graduate student: 陳宥然
You-Ran Chen
Thesis title: 基於深度學習之人臉特徵與屬性識別
Deep Learning for Facial Attribute Identification
Advisor: 徐繼聖
Gee-Sern Hsu
Oral defense committee: 亞魯
ArulMurugan Ambikapathi
陳彥呈
Yan-Cheng Chen
蘇順豐
Shun-Feng Su
鍾國亮
Kuo-Liang Chung
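Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of publication: 2016
Academic year of graduation: 104 (ROC calendar, i.e., 2015-2016)
Language: Chinese
Number of pages: 72
Chinese keywords: 深度學習 (deep learning), 屬性識別 (attribute identification), 人臉識別 (face recognition)
English keywords: deep learning, face verification, attribute identification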

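This thesis uses deep learning to extract more than 40 facial attributes and uses a subset of them to improve the face recognition rate. Most methods that apply deep learning to face recognition focus on patch processing and on tuning the configuration of the convolutional neural network (CNN). In contrast, this thesis extracts both facial features and attribute features for face recognition, and examines in depth the neuron response patterns produced by different attributes, from which the attributes are divided into those that are easier and those that are harder to identify. It also examines whether the attribute of attractiveness is affected by expression, hairstyle, makeup, and similar factors. Our goal is to offer a suggestion on how to make our faces more attractive.
CNNs have shown impressive performance in computer vision, but no one has yet compared the performance gap between CNNs and hand-crafted features. The final part of this thesis therefore tests and compares both on the FERET and M-PIE face databases, the Morph age database, and the CKe expression database.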


Most deep learning approaches to face recognition improve performance by varying network settings and partitioning face images. In contrast, we extract facial attributes and show that some of them can be exploited to boost face verification performance. We train an off-the-shelf Convolutional Neural Network (CNN) to extract facial features and use it to identify more than 40 facial attributes. We rank the attributes by how difficult each is to identify and relate this ranking to human visual perception of the same attributes. By adding some attribute features to the facial features, we improve the face verification rate on the LFW benchmark. We also uncover interactions between attributes and show that facial attractiveness can be altered by expression, makeup, and other attributes, and that the alteration can be described quantitatively. To better understand how deep learning compares with hand-crafted features and methods for face recognition, we compare the CNN with state-of-the-art “shallow” (i.e., not “deep” learning) approaches to face recognition, age estimation, and expression identification on common benchmarks.
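
The abstract's idea of adding attribute features to facial features for verification can be illustrated with a minimal sketch. This is not the thesis's implementation: the fusion by concatenation, the `attr_weight` parameter, the cosine-similarity score, and the `threshold` value are all assumptions made for illustration, and the feature vectors stand in for whatever the CNN and attribute classifiers would output.

```python
import numpy as np


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def fuse(face_feat, attr_feat, attr_weight=0.5):
    """Concatenate a face descriptor with a weighted attribute descriptor.

    attr_weight is a hypothetical knob; the abstract does not say how the
    two feature types are balanced.
    """
    return np.concatenate([l2_normalize(face_feat),
                           attr_weight * l2_normalize(attr_feat)])


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(np.dot(l2_normalize(a), l2_normalize(b)))


def same_person(face_a, attr_a, face_b, attr_b, threshold=0.5):
    """Declare a match when the fused descriptors are similar enough.

    threshold is a placeholder; in practice it would be tuned on
    held-out image pairs.
    """
    score = cosine_similarity(fuse(face_a, attr_a), fuse(face_b, attr_b))
    return score >= threshold, score
```

Normalizing each descriptor before concatenation keeps either feature type from dominating the similarity score purely because of its scale, which is one plausible reason to weight the attribute part explicitly.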

Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Method Overview
1.3 Contributions
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Related Work on Facial Attributes
2.1.1 Poselet Input Patches
2.1.2 Localization network
2.2 Related Work on Face Recognition
2.2.1 Purity data
2.2.2 Multi-model
2.3 Related Work on Traditional Features
2.3.1 MDML-DCPs
2.3.2 Bio-Inspired Features for Age
Chapter 3 Main Methods
3.1 Convolutional Neural Network
3.1.1 Feedforward Pass
3.1.2 Backpropagation Pass
3.1.3 Convolution layer
3.1.4 Pooling layer
3.1.5 Dropout layer
3.2 Network Architectures
3.2.1 ConvNet
3.2.2 DeepID
3.2.3 VGG
3.3 Regressive Tree Structured Model
3.4 Training and Feature Extraction
Chapter 4 Experimental Setup and Analysis
4.1 Benchmark Databases
4.1.1 CASIA-WebFace
4.1.2 LFW
4.1.3 CelebA
4.1.4 FERET
4.1.5 M-PIE
4.1.6 Morph
4.1.7 CKe
4.2 Sample Specifications
4.3 Experimental Design
4.4 Experimental Results and Analysis
4.4.1 Best Image Preprocessing
4.4.2 Facial Attributes and Neuron Activation
4.4.3 Face Recognition
4.4.4 Comparison with Traditional Features
Chapter 5 Conclusions and Future Work
Chapter 6 References

