
Graduate Student: Yen-Lun Chen (陳彥綸)
Thesis Title: A Study of Racial Face Recognition Using Synthetic Images with Deep Convolutional Neural Networks (使用合成圖片結合深度卷積神經網路在影像中人臉種族分類之研究)
Advisor: Yi-Leh Wu (吳怡樂)
Committee Members: Jiann-Jong Chen (陳建中), Zheng-Yuan Tang (唐政元), Maw-Kea Hor (何瑁鎧), Li-Gang Yan (閻立剛), Yi-Leh Wu (吳怡樂)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of Publication: 2017
Academic Year of Graduation: 105 (2016-2017)
Language: Chinese
Number of Pages: 37
Keywords (Chinese): Caffe, Racial Recognition, Synthetic Images, Deep Convolutional Neural Networks, Deep Learning
Keywords (English): Caffe, Racial Face Classification, Convolutional Neural Network, Deep Learning, Synthetic Image
Views: 380 / Downloads: 5


    In the past, researchers typically employed facial feature extraction together with shallow learners such as decision trees, SVMs, and Naive Bayes to classify faces of different races. Deep learning usually takes a long time to train, but with advances in hardware and newly proposed algorithms, the training-time problem has gradually been alleviated. Deep convolutional neural networks perform very well on image classification. In this thesis, we use deep convolutional neural networks to address the problem of classifying faces of different racial origins. Because convolutional neural networks usually require a huge amount of training data for good performance, and such a training set of real racial faces is not available to us, this study proposes to incorporate synthetic facial images into our training set to sufficiently increase its size. To the best of our knowledge, this study is the first to propose incorporating synthetic racial faces to train a deep convolutional neural network to classify real racial faces. We compare the performance of training sets containing only synthetic facial images with mixtures of synthetic and real facial images. Our experiments show that training with only the real facial images (2,500 images) achieves 91.25% accuracy in classifying faces of three different racial origins, while training with a mixture of the 2,500 real facial images and 15,000 synthetic facial images further improves the accuracy to 98.5%.
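    The mixture experiment described in the abstract (2,500 real faces plus 15,000 synthetic faces over three race classes) amounts to building one shuffled training list that draws from both sources. The following is a minimal sketch of that step only; the directory layout, file names, per-class counts, and the helper function are illustrative assumptions, not taken from the thesis.

```python
import random

# Three race classes, as in the thesis' three-way classification experiments
# (the class names here are placeholders).
RACES = ["asian", "african", "caucasian"]

def make_mixed_training_set(n_real_per_class, n_synth_per_class, seed=0):
    """Build a shuffled list of (image_path, label) pairs mixing real
    and synthetic face images for each race class."""
    rng = random.Random(seed)
    samples = []
    for label, race in enumerate(RACES):
        # Real faces (e.g. from the Face Place dataset [21]).
        samples += [("real/%s/%04d.jpg" % (race, i), label)
                    for i in range(n_real_per_class)]
        # Synthetic faces (e.g. rendered with FaceGen [22]).
        samples += [("synthetic/%s/%05d.jpg" % (race, i), label)
                    for i in range(n_synth_per_class)]
    rng.shuffle(samples)  # interleave real and synthetic samples before training
    return samples

# Roughly the thesis' mixture: ~2,500 real + 15,000 synthetic images in total.
mixed = make_mixed_training_set(n_real_per_class=833, n_synth_per_class=5000)
print(len(mixed))  # 17499 samples across the three classes
```

    In the actual experiments such a list would typically be converted into a Caffe-readable database and fed to the AlexNet model for training; the sketch above only illustrates how the mixed training set is assembled.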

    Abstract (Chinese)
    Abstract
    Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
    Chapter 2. Deep Learning Model
    Chapter 3. Caffe and Original Model
        3.1 Caffe
        3.2 CNN and the Original Model
    Chapter 4. Experiment
        4.1 Synthetic Face Dataset and Real Face Dataset
        4.2 Using the Real Face Dataset on the AlexNet Model
        4.3 Using the Synthetic Face Dataset on the AlexNet Model
        4.4 Using the Mixture Dataset on the AlexNet Model
    Chapter 5. Conclusions and Future Work
    References
    Appendix A
    Appendix B

    [1] Hosoi S., Erina T., and Masato K., “Ethnicity estimation with facial images,” in Proc. IEEE 6th Int. Conf. Autom. Face Gesture Recog., pp. 195–200, 2004.
    [2] Lin H., Lu H. C., and Zhang L. H., “A new automatic recognition system of gender, age and ethnicity,” in Proc. World Congr. Intell. Control Autom., pp. 9988–9991, 2006.
    [3] Lyle J. R., Miller P. E., Pundlik S. J., and Woodard D. L., “Soft biometric classification using periocular region features,” in Proc. IEEE 4th Int. Conf. Biometrics: Theory Appl. Syst., pp. 1–7, 2010.
    [4] Demirkus M., Garg K., and Guler S., “Automated person categorization for video surveillance using soft biometrics,” in Proc. SPIE, Biometric Technol. Human Identification VII, pp. 76670P, 2010.
    [5] Kumar N., Berg A., Belhumeur P., and Nayar S., “Describable visual attributes for face verification and image search,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 10, pp. 1962–1977, Oct. 2011.
    [6] Xie Y., Luu K., and Savvides M., “A robust approach to facial ethnicity classification on large scale face databases,” in Proc. IEEE Int. Conf. Biometrics: Theory, Appl. Syst., pp. 143–149, 2012.
    [7] Klare B. F., Burge M., Klontz J., Bruegge R. W. V., and Jain A. K., “Face recognition performance: Role of demographic information,” IEEE Trans. Inform. Forensics Security, vol. 7, no. 6, pp. 1789–1801, Dec. 2012.
    [8] Huang D., Ding H., Wang C., Wang Y., Zhang G., and Chen L., “Local circular patterns for multi-modal facial gender and ethnicity classification,” Image Vis. Comput., vol. 32, no. 12, pp. 1181–1193, 2014.
    [9] Wang W., He F., and Zhao Q., “Facial ethnicity classification with deep convolutional neural networks,” 2016.
    [10] GPU development in recent years, “http://bkultrasound.com/blog/the-next-generation-of-ultrasound-technology”, referenced on May 15, 2015.
    [11] Bengio, Y. Learning deep architectures for AI. Foundations and trends® in Machine Learning, 2(1), 1-127, 2009.
    [12] Lee, T. H., A Study of General Fine-Grained Classification with Deep Convolutional Neural Networks, National Taiwan University of Science and Technology Department of Computer Science and Information Engineering, 2015
    [13] LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324, 1998.
    [14] Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., ... & Darrell, T.. Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the ACM International Conference on Multimedia (pp. 675-678). ACM, 2014.
    [15] Typical CNN architecture, “https://en.wikipedia.org/wiki/Convolutional_neural_network”, referenced on May 31, 2017.
    [16] Krizhevsky, A., Sutskever, I., & Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105), 2012.
    [17] Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Berg, A. C. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211-252, 2015.
    [18] Nair, V., & Hinton, G. E. Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10) (pp. 807-814), 2010.
    [19] Boureau, Y. L., Bach, F., LeCun, Y., & Ponce, J. Learning mid-level features for recognition. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on (pp. 2559-2566). IEEE, 2010.
    [20] Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. R. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, 2012.
    [21] Michael J. Tarr (Center for the Neural Basis of Cognition and Department of Psychology, Carnegie Mellon University, http://www.tarrlab.org/. Funding provided by NSF award 0339122.), “http://wiki.cnbc.cmu.edu/Face_Place”, referenced on May 31, 2017.
    [22] FaceGen Modeller Demo, “https://facegen.com”, referenced on May 31, 2017.

    Full text available from 2018/07/18 (campus network)
    Full text available from 2027/07/18 (off-campus network)
    Full text available from 2027/07/18 (National Central Library: Taiwan NDLTD system)