
Graduate Student: 林律恩 (Lu-En Lin)
Thesis Title: Data Augmentation with Occluded Facial Features for Age and Gender Estimation (基於臉部特徵遮蔽式資料擴增之年齡及性別估測)
Advisor: 林昌鴻 (Chang-Hong Lin)
Committee Members: 林昌鴻 (Chang-Hong Lin), 陳維美 (Wei-Mei Chen), 林敬舜 (Ching-Shun Lin), 王煥宗 (Huan-Chun Wang), 陳永耀 (Yung-Yao Chen)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2020
Academic Year: 108
Language: English
Pages: 67
Keywords (Chinese): 年齡預測、性別預測、深度學習、卷積神經網路、資料擴增
Keywords (English): Gender Classification, Age Classification, Deep Learning, Convolutional Neural Networks, Data Augmentation
Hits: 244; Downloads: 0
Facial analysis has long been a popular research topic because of its wide range of applications: commercial use, entertainment, surveillance, and human-machine interaction all rely on facial analysis to provide essential information. With the recent rise of deep learning across many fields, facial analysis research has been able to move toward more complex and practical use cases. For age and gender estimation, deep networks have substantially improved accuracy over earlier machine learning approaches, yet there is still considerable room for improvement in real-life applications. This thesis therefore proposes a data augmentation method that processes the training data of a deep network to simulate, on the facial features, the difficulties encountered in real life, providing the network with more training data and improving its robustness and generalization. The method uses three simple image processing techniques to occlude the facial features of the input image: Blackout, Random Brightness, and Blur. The thesis also proposes a slightly modified cross-entropy loss: when the age prediction differs from the ground truth by one class, the age loss receives a smaller penalty. To verify the effectiveness of the method, the proposed augmentation was implemented on two different convolutional neural networks, a slightly modified AdienceNet and a slightly modified VGG16, to perform age and gender prediction. The final results show that, for age prediction, the proposed augmentation and the modified cross-entropy loss improve the accuracy of the slightly modified AdienceNet and the slightly modified VGG16 by 6.62% and 6.53%, respectively, and for gender prediction by 6.20% and 6.31%, respectively.
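The modified cross-entropy idea — a smaller penalty when the predicted age class lands next to the ground-truth class — can be sketched in NumPy as follows. This is a minimal illustration only: the scaling factor `adjacent_weight` and the argmax-based adjacency test are assumptions, not the thesis's exact formulation.

```python
import numpy as np

def modified_cross_entropy(probs, label, adjacent_weight=0.5, eps=1e-12):
    """Cross-entropy loss with a reduced penalty on adjacent age classes.

    probs: softmax output over ordered age classes (1-D array).
    label: index of the ground-truth age class.
    adjacent_weight: assumed down-weighting factor for near-miss predictions.
    """
    pred = int(np.argmax(probs))            # predicted age class
    loss = -np.log(probs[label] + eps)      # standard cross-entropy term
    if abs(pred - label) == 1:              # prediction fell on a neighboring age group
        loss *= adjacent_weight             # penalize it less than a distant miss
    return loss
```

Because age classes are ordered, a prediction one bin away is far less wrong than a distant one, which is the intuition the weighting encodes.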


    Facial analysis tasks have been a very active research topic over the years because of their broad variety of applications, such as human-machine interaction, commercial use, entertainment, and surveillance. With the rise of deep learning, these tasks have been able to address more difficult and practical problems. Even though age and gender estimation with deep learning methods achieved considerable improvements over traditional machine-learning-based methods, the results still fall short of the needs of real-life applications. In this thesis, a data augmentation method is proposed that simulates real-life challenges on the main features of the human face. The proposed method improves the generalization and robustness of the network by generating a greater variety of training samples. The method, Feature Occlusion, uses three simple occlusion techniques, Blackout, Random Brightness, and Blur, to simulate different challenges that can occur in real-life situations. A modified cross-entropy loss is also proposed that gives a smaller penalty to age predictions that land on classes adjacent to the ground-truth class. The effectiveness of the proposed method is verified by implementing the augmentation method and the modified cross-entropy loss on two different convolutional neural networks (CNNs), a slightly modified AdienceNet and a slightly modified VGG16, to perform age and gender classification. The proposed augmentation system improves the age and gender classification accuracy of the slightly modified AdienceNet network by 6.62% and 6.53% on the Adience dataset, respectively, and improves the age and gender classification accuracy of the slightly modified VGG16 network by 6.20% and 6.31% on the Adience dataset, respectively.
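The three occlusion techniques named above can be sketched with plain NumPy on a grayscale face crop. This is a minimal sketch under stated assumptions: the region coordinates, the brightness range, and the blur kernel size are illustrative choices, not the parameters used in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def blackout(img, box):
    """Fill the selected feature region (y0, y1, x0, x1) with black pixels."""
    y0, y1, x0, x1 = box
    out = img.copy()
    out[y0:y1, x0:x1] = 0
    return out

def random_brightness(img, box, low=0.5, high=1.5):
    """Scale the region by a random brightness factor (range is an assumption)."""
    y0, y1, x0, x1 = box
    out = img.astype(np.float32)
    out[y0:y1, x0:x1] *= rng.uniform(low, high)
    return np.clip(out, 0, 255).astype(np.uint8)

def blur(img, box, k=5):
    """Mean-blur the region with a k x k box filter (kernel size is an assumption)."""
    y0, y1, x0, x1 = box
    out = img.copy()
    region = img[y0:y1, x0:x1].astype(np.float32)
    pad = k // 2
    padded = np.pad(region, pad, mode="edge")
    # Average every k x k neighborhood by summing shifted copies of the region.
    acc = np.zeros_like(region)
    for dy in range(k):
        for dx in range(k):
            acc += padded[dy:dy + region.shape[0], dx:dx + region.shape[1]]
    out[y0:y1, x0:x1] = (acc / (k * k)).astype(np.uint8)
    return out
```

During training, one of the three transforms would be applied to a detected facial-feature region of each sample, leaving the rest of the face untouched.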

    Abstract (Chinese)
    Abstract
    List of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Motivation
      1.2 Contributions
      1.3 Thesis Organization
    Chapter 2 Related Works
      2.1 Traditional Age and Gender Estimation
      2.2 Deep Learning Methods for Age and Gender Estimation
      2.3 Training Strategies
    Chapter 3 Proposed Method
      3.1 Feature Detection
        3.1.1 Face Detection
        3.1.2 Face Alignment
      3.2 Feature Occlusion
        3.2.1 Region Selection
        3.2.2 Random Brightness
        3.2.3 Blackout
        3.2.4 Blur
      3.3 Data Augmentation
      3.4 CNN
        3.4.1 Modified AdienceNet [17]
        3.4.2 Modified VGG16 [18]
      3.5 Training Details
        3.5.1 Initialization
        3.5.2 Loss Function
        3.5.3 Optimizer
        3.5.4 Training Method
    Chapter 4 Experimental Results
      4.1 Experimental Environment
      4.2 Adience Dataset [7]
      4.3 Performance Evaluation
        4.3.1 Age and Gender Classification
          4.3.1.1 Modified Cross-Entropy Loss
          4.3.1.2 Feature Occlusion
        4.3.2 Gender Classification
        4.3.3 Age Classification
      4.4 Mask Evaluation
    Chapter 5 Conclusions and Future Works
      5.1 Conclusions
      5.2 Future Works
    References

    [1] A.K. Jain, A. Ross, and S. Prabhakar, “An introduction to biometric recognition,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, pp. 4-20, 2004.
    [2] R. Maldonado, P. Tansuhaj, and D.D. Muehling, “The impact of gender on ad processing: A social identity perspective,” Academy of Marketing Science Review, vol. 3, no. 3, pp. 1-15, 2003.
    [3] K. Luu, K. Ricanek, T.D. Bui, and C.Y. Suen, “Age estimation using active appearance models and support vector machine regression,” in 2009 IEEE 3rd International Conference on Biometrics: Theory, Applications, and Systems (BTAS 2009), pp. 1-5, 2009.
    [4] A. Gunay and V.V. Nabiyev, “Automatic age classification with LBP,” in 2008 23rd International Symposium on Computer and Information Sciences (ISCIS 2008), pp. 1-4, 2008.
    [5] M. Hu, Y. Zheng, F. Ren, and H. Jiang, “Age estimation and gender classification of facial images based on Local Directional Pattern,” in 2014 IEEE 3rd International Conference on Cloud Computing and Intelligence Systems (ICCCIS 2014), pp. 103-107, 2014.
    [6] C. Shan, “Learning local features for age estimation on real-life faces,” in Proceedings of the 1st ACM International Workshop on Multimodal Pervasive Video Analysis (MPVA 2010), pp. 23-28, 2010.
    [7] E. Eidinger, R. Enbar, and T. Hassner, “Age and gender estimation of unfiltered faces,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 12, pp. 2170-2179, 2014.
    [8] R. Ranjan, S. Sankaranarayanan, C.D. Castillo, and R. Chellappa, “An all-in-one convolutional neural network for face analysis,” in 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 17-24, 2017.
    [9] P. Rodríguez, G. Cucurull, J.M. Gonfaus, F.X. Roca, and J. Gonzalez, “Age and gender recognition in the wild with deep attention,” Pattern Recognition, vol. 72, pp. 563-571, 2017.
    [10] R. Rothe, R. Timofte, and L. Van Gool, “DEX: Deep expectation of apparent age from a single image,” in Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCV 2015), pp. 10-15, 2015.
    [11] G. Ozbulak, Y. Aytar, and H.K. Ekenel, “How transferable are CNN-based features for age and gender classification?,” in 2016 International Conference of the Biometrics Special Interest Group (BIOSIG 2016), pp. 1-6, 2016.
    [12] E. Hoffer, R. Banner, I. Golan, and D. Soudry, “Norm matters: efficient and accurate normalization schemes in deep networks,” Advances in Neural Information Processing Systems (NIPS 2018), pp. 2160-2170, 2018.
    [13] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to prevent neural networks from overfitting,” The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.
    [14] B. Zoph, E.D. Cubuk, G. Ghiasi, T.Y. Lin, J. Shlens, and Q.V. Le, “Learning data augmentation strategies for object detection,” arXiv preprint arXiv:1906.11172, 2019.
    [15] J. Wang and L. Perez, “The effectiveness of data augmentation in image classification using deep learning,” arXiv preprint arXiv:1712.04621, 2017.
    [16] J. Lemley, S. Bazrafkan, and P. Corcoran, “Smart augmentation: learning an optimal data augmentation strategy,” IEEE Access, vol. 5, pp. 5858-5869, 2017.
    [17] G. Levi and T. Hassner, “Age and gender classification using convolutional neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPR 2015), pp. 2622-2629, 2015.
    [18] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
    [19] J. Tapia and C. Perez, “Gender classification based on fusion of different spatial scale features selected by mutual information from histogram of LBP, intensity, and shape,” IEEE Transactions on Information Forensics and Security, vol. 8, no. 3, pp. 488-499, 2013.
    [20] T. Ahonen, A. Hadid, and M. Pietikainen, “Face description with local binary patterns: Application to face recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 12, pp. 2037-2041, 2006.
    [21] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273-297, 1995.
    [22] T.F. Cootes, G.J. Edwards, and C.J. Taylor, “Active appearance models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 681-685, 2001.
    [23] M.Y. El Dib and M. El-Saban, “Human age estimation using enhanced bio-inspired features (EBIF),” in 2010 IEEE International Conference on Image Processing (ICIP 2010), pp. 1589-1592, 2010.
    [24] T.F. Cootes, C.J. Taylor, D.H. Cooper, and J. Graham, “Active shape models - their training and application,” Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38-59, 1995.
    [25] D. Dunn and W.E. Higgins, “Optimal Gabor filters for texture segmentation,” IEEE Transactions on Image Processing, vol. 4, no. 7, pp. 947-964, 1995.
    [26] T. Jabid, M.H. Kabir, and O. Chae, “Local directional pattern (LDP) for face recognition,” in 2010 Digest of Technical Papers International Conference on Consumer Electronics (ICCE 2010), pp. 329-330, 2010.
    [27] P.J. Phillips, H. Moon, S.A. Rizvi, and P.J. Rauss, “The FERET evaluation methodology for face-recognition algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090-1104, 2000.
    [28] Y. Fu, T. Hospedales, T. Xiang, Y. Yao, and S. Gong, “Interestingness prediction by robust learning to rank,” in 13th European Conference on Computer Vision (ECCV 2014), pp. 488-503, 2014.
    [29] K. Ricanek and T. Tesafaye, “MORPH: A longitudinal image database of normal adult age-progression,” in 7th International Conference on Automatic Face and Gesture Recognition (FGR 2006), pp. 341-345, 2006.
    [30] S. Lapuschkin, A. Binder, K.R. Muller, and W. Samek, “Understanding and comparing deep neural networks for age and gender classification,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV 2017), pp. 1629-1638, 2017.
    [31] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” in Proceedings of the 22nd ACM International Conference on Multimedia, pp. 675-678, 2014.
    [32] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), pp. 1-9, 2015.
    [33] J. Chen, A. Kumar, R. Ranjan, V.M. Patel, A. Alavi, and R. Chellappa, “A cascaded convolutional neural network for age estimation of unconstrained faces,” in 2016 IEEE 8th International Conference on Biometrics Theory, Applications and Systems (BTAS 2016), pp. 1-8, 2016.
    [34] J. van de Wolfshaar, M.F. Karaaba, and M.A. Wiering, “Deep convolutional neural networks and support vector machines for gender recognition,” in 2015 IEEE Symposium Series on Computational Intelligence (SSCI 2015), pp. 188-195, 2015.
    [35] V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu, “Recurrent models of visual attention,” Advances in Neural Information Processing Systems (NIPS 2014), pp. 2204-2212, 2014.
    [36] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” Advances in Neural Information Processing Systems (NIPS 2014), pp. 2672-2680, 2014.
    [37] T. DeVries and G. Taylor, “Improved regularization of convolutional neural networks with cutout,” arXiv preprint arXiv:1708.04552, 2017.
    [38] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, “Joint face detection and alignment using multitask cascaded convolutional networks,” IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499-1503, 2016.
    [39] G. Bradski, “The OpenCV library,” Dr. Dobb's Journal of Software Tools, 2000.
    [40] V. Nair and G.E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proceedings of the 27th International Conference on Machine Learning (ICML 2010), 2010.
    [41] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), pp. 448-456, 2015.
    [42] Flickr: https://www.flickr.com/, [Online], last accessed July 2020.
    [43] N. Qian, “On the momentum term in gradient descent learning algorithms,” Neural Networks, vol. 12, no. 1, pp. 145-151, 1999.
    [44] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010), pp. 249-256, 2010.
    [45] A. Krizhevsky, I. Sutskever, and G.E. Hinton, “ImageNet classification with deep convolutional neural networks,” Advances in Neural Information Processing Systems (NIPS 2012), pp. 1097-1105, 2012.
    [46] A. Gallagher and T. Chen, “Understanding images of groups of people,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), pp. 256-263, 2009.

    Full-Text Release Date: 2025/08/05 (campus network)
    Full text not authorized for release (off-campus network)
    Full text not authorized for release (National Central Library: Taiwan NDLTD system)