
Author: 林律恩 (Lu-En Lin)
Thesis Title: 基於臉部特徵遮蔽式資料擴增之年齡及性別估測 (Data Augmentation with Occluded Facial Features for Age and Gender Estimation)
Advisor: 林昌鴻 (Chang-Hong Lin)
Committee: 林昌鴻 (Chang-Hong Lin), 陳維美 (Wei-Mei Chen), 林敬舜 (Ching-Shun Lin), 王煥宗 (Huan-Chun Wang), 陳永耀 (Yung-Yao Chen)
Degree: 碩士 (Master)
Department: 電資學院 - 電子工程系 (Department of Electronic and Computer Engineering)
Thesis Publication Year: 2020
Graduation Academic Year: 108
Language: 英文 (English)
Pages: 67
Keywords (in Chinese): 年齡預測、性別預測、深度學習、卷積神經網路、資料擴增
Keywords (in other languages): Gender Classification, Age Classification, Deep Learning, Convolutional Neural Networks, Data Augmentation

Facial analysis, with its wide range of applications, has long been a popular research topic. Commercial use, entertainment, surveillance, and human-machine interaction all rely on facial analysis to provide the necessary information. With the recent rise of deep learning across many fields, facial analysis research has been able to move toward more complex and practical usage scenarios. For age and gender estimation, deep learning networks have greatly improved accuracy compared with earlier machine learning approaches, yet there is still considerable room for improvement in real-life applications. This thesis therefore proposes a data augmentation method that processes the training data of a deep network by simulating, on the facial features, the difficulties encountered in real life, thereby providing the network with more training data and improving its robustness and generalization. The method uses three simple image processing techniques to occlude the facial features of the input image: Blackout, Random Brightness, and Blur. This thesis also proposes a slightly modified cross-entropy loss: when the predicted age group differs from the ground truth by one class, the age loss receives a smaller penalty. To verify the effectiveness of the method, the proposed data augmentation is implemented on two different convolutional neural networks, a slightly modified AdienceNet and a slightly modified VGG16, to perform age and gender prediction. The final results show that, for age prediction, the proposed data augmentation and slightly modified cross-entropy loss improve the accuracy of the slightly modified AdienceNet and the slightly modified VGG16 by 6.62% and 6.53%, respectively, and for gender prediction, by 6.20% and 6.31%, respectively.
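The record above does not give the exact formulation of the modified cross-entropy loss, only that predictions landing on an age group adjacent to the ground truth are penalized less. The NumPy sketch below is one hypothetical way to realize that idea: the one-hot age target is softened so that the two neighboring classes carry a small weight before standard cross-entropy is computed. The function names, the `adjacent_weight` value, and the weighting scheme are illustrative assumptions, not the loss actually used in the thesis.

```python
import numpy as np

def soft_age_target(label, num_classes=8, adjacent_weight=0.1):
    """Build a softened target: most mass on the true age class and a
    small amount on its immediate neighbors (hypothetical scheme)."""
    target = np.zeros(num_classes, dtype=np.float64)
    neighbors = [c for c in (label - 1, label + 1) if 0 <= c < num_classes]
    target[label] = 1.0 - adjacent_weight * len(neighbors)
    for c in neighbors:
        target[c] = adjacent_weight
    return target

def modified_cross_entropy(logits, label, adjacent_weight=0.1):
    """Cross-entropy against the softened target, so a prediction that
    falls on an adjacent age class is penalized less than a distant one."""
    logits = np.asarray(logits, dtype=np.float64)
    z = logits - logits.max()                  # numerical stabilization
    log_probs = z - np.log(np.exp(z).sum())    # log-softmax
    target = soft_age_target(label, logits.size, adjacent_weight)
    return float(-(target * log_probs).sum())

# Example with the 8 Adience age groups; logits and label are illustrative only.
example_logits = np.array([0.1, 0.3, 1.2, 2.5, 1.9, 0.2, -0.4, -1.0])
print(modified_cross_entropy(example_logits, label=3))
```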


Facial analysis has been a very active research topic over the years owing to its broad variety of applications, such as human-machine interaction, commercial uses, entertainment, and surveillance. With the rise of deep learning, these tasks have been able to address more difficult and practical problems. Even though age and gender estimation with deep learning methods has achieved considerable improvements over traditional machine-learning-based methods, the results still fall short of what real-life applications require. In this thesis, a data augmentation method that simulates real-life challenges on the main features of the human face is proposed. With the proposed method, we improve the generalization and robustness of the network by generating a greater variety of training samples. The proposed method, Feature Occlusion, uses three simple occlusion techniques, Blackout, Random Brightness, and Blur, to simulate different challenges that could occur in real-life situations. We also propose a modified cross-entropy loss that gives less penalty to age predictions that land on the classes adjacent to the ground truth class. We verify the effectiveness of the proposed method by implementing the augmentation method and the modified cross-entropy loss on two different convolutional neural networks (CNNs), a slightly modified AdienceNet and a slightly modified VGG16, to perform age and gender classification. The proposed augmentation system improves the age and gender classification accuracy of the slightly modified AdienceNet network by 6.62% and 6.53% on the Adience dataset, respectively. It also improves the age and gender classification accuracy of the slightly modified VGG16 network by 6.20% and 6.31% on the Adience dataset, respectively.
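As a rough illustration of the three occlusion techniques named in the abstract, the following OpenCV/NumPy sketch applies Blackout, Random Brightness, or Blur to a rectangular facial-feature region of an aligned face image. The region coordinates, brightness range, blur kernel size, and application probability are assumed values for illustration, not the settings used in the thesis.

```python
import random
import cv2
import numpy as np

def occlude_feature(image, region, technique=None):
    """Apply one occlusion technique to a facial-feature region.

    image:     aligned face image (H x W x 3, uint8)
    region:    (x, y, w, h) box around a facial feature, e.g. the eyes
               (assumed to come from a landmark detector such as MTCNN)
    technique: 'blackout', 'brightness', or 'blur'; chosen at random if None
    """
    x, y, w, h = region
    out = image.copy()
    patch = out[y:y + h, x:x + w]          # view into the output image
    if technique is None:
        technique = random.choice(['blackout', 'brightness', 'blur'])

    if technique == 'blackout':
        # Mask the feature completely.
        patch[:] = 0
    elif technique == 'brightness':
        # Shift brightness by a random offset (range is an assumed setting).
        offset = random.randint(-80, 80)
        patch[:] = np.clip(patch.astype(np.int16) + offset, 0, 255).astype(np.uint8)
    elif technique == 'blur':
        # Heavy Gaussian blur to remove fine detail (kernel size assumed).
        patch[:] = cv2.GaussianBlur(patch, (15, 15), 0)
    return out

# Usage sketch: augment a training sample with probability 0.5 (probability assumed).
# face = cv2.imread('aligned_face.jpg')
# eye_box = (60, 80, 110, 40)   # hypothetical eye region from landmarks
# if random.random() < 0.5:
#     face = occlude_feature(face, eye_box)
```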

摘要
ABSTRACT
LIST OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTIONS
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis Organization
CHAPTER 2 RELATED WORKS
  2.1 Traditional Age and Gender Estimation
  2.2 Deep Learning Methods for Age and Gender Estimation
  2.3 Training Strategies
CHAPTER 3 PROPOSED METHOD
  3.1 Feature Detection
    3.1.1 Face Detection
    3.1.2 Face Alignment
  3.2 Feature Occlusion
    3.2.1 Region Selection
    3.2.2 Random Brightness
    3.2.3 Blackout
    3.2.4 Blur
  3.3 Data Augmentation
  3.4 CNN
    3.4.1 Modified AdienceNet [17]
    3.4.2 Modified VGG16 [18]
  3.5 Training Details
    3.5.1 Initialization
    3.5.2 Loss Function
    3.5.3 Optimizer
    3.5.4 Training Method
CHAPTER 4 EXPERIMENTAL RESULTS
  4.1 Experimental Environment
  4.2 Adience Dataset [7]
  4.3 Performance Evaluation
    4.3.1 Age and Gender Classification
      4.3.1.1 Modified Cross-Entropy Loss
      4.3.1.2 Feature Occlusion
    4.3.2 Gender Classification
    4.3.3 Age Classification
  4.4 Mask Evaluation
CHAPTER 5 CONCLUSIONS AND FUTURE WORKS
  5.1 Conclusions
  5.2 Future Works
REFERENCES

[1] A.K. Jain, A. Ross, and S. Prabhakar, “An introduction to biometric recognition,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, pp. 4-20, 2004.
[2] R. Maldonado, P. Tansuhaj, and D.D. Muehling, “The impact of gender on ad processing: A social identity perspective,” Academy of Marketing Science Review, vol. 3, no. 3, pp. 1-15, 2003.
[3] K. Luu, K. Ricanek, T.D. Bui, and C.Y. Suen, “Age estimation using active appearance models and support vector machine regression,” in 2009 IEEE 3rd International Conference on Biometrics: Theory, Applications, and Systems (BTAS 2009), pp. 1-5, 2009.
[4] A. Gunay and V.V. Nabiyev, “Automatic age classification with LBP,” in 2008 23rd International Symposium on Computer and Information Sciences (ISCIS 2008), pp. 1-4, 2008.
[5] M. Hu, Y. Zheng, F. Ren, and H. Jiang, “Age estimation and gender classification of facial images based on Local Directional Pattern,” in 2014 IEEE 3rd International Conference on Cloud Computing and Intelligence Systems (ICCCIS 2014), pp. 103-107, 2014.
[6] C. Shan, “Learning local features for age estimation on real-life faces,” in Proceedings of the 1st ACM International Workshop on Multimodal Pervasive Video Analysis (MPVA 2010), pp. 23-28, 2010.
[7] E. Eidinger, R. Enbar, and T. Hassner, “Age and gender estimation of unfiltered faces,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 12, pp. 2170-2179, 2014.
[8] R. Ranjan, S. Sankaranarayanan, C.D. Castillo, and R. Chellappa, “An all-in-one convolutional neural network for face analysis,” in 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 17-24, 2017.
[9] P. Rodríguez, G. Cucurull, J.M. Gonfaus, F.X. Roca, and J. Gonzalez, “Age and gender recognition in the wild with deep attention,” Pattern Recognition, vol. 72, pp. 563-571, 2017.
[10] R. Rothe, R. Timofte, and L. Van Gool, “DEX: Deep expectation of apparent age from a single image,” in Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCV 2015), pp. 10-15, 2015.
[11] G. Ozbulak, Y. Aytar, and H.K. Ekenel, “How transferable are CNN-based features for age and gender classification?,” in 2016 International Conference of the Biometrics Special Interest Group (BIOSIG 2016), pp. 1-6, 2016.
[12] E. Hoffer, R. Banner, I. Golan, and D. Soudry, “Norm matters: efficient and accurate normalization schemes in deep networks,” in Advances in Neural Information Processing Systems (NIPS 2018), pp. 2160-2170, 2018.
[13] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: a simple way to prevent neural networks from overfitting,” The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.
[14] B. Zoph, E.D. Cubuk, G. Ghiasi, T.Y. Lin, J. Shlens, and Q.V. Le, “Learning data augmentation strategies for object detection,” arXiv preprint arXiv:1906.11172, 2019.
[15] J. Wang and L. Perez, “The effectiveness of data augmentation in image classification using deep learning,” arXiv preprint arXiv:1712.04621, 2017.
[16] J. Lemley, S. Bazrafkan, and P. Corcoran, “Smart Augmentation Learning an Optimal Data Augmentation Strategy,” IEEE Access, vol. 5, pp. 5858-5869, 2017.
[17] G. Levi and T. Hassner, “Age and gender classification using convolutional neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPR 2015), pp. 2622-2629, 2015.
[18] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[19] J. Tapia and C. Perez, “Gender classification based on fusion of different spatial scale features selected by mutual information from histogram of LBP, intensity, and shape,” IEEE Transactions on Information Forensics and Security, vol. 8, no. 3, pp. 488-499, 2013.
[20] T. Ahonen, A. Hadid, and M. Pietikainen, “Face description with local binary patterns: Application to face recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 12, pp. 2037-2041, 2006.
[21] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273-297, 1995.
[22] T.F. Cootes, G.J. Edwards, and C.J. Taylor, “Active appearance models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 681-685, 2001.
[23] M.Y. El Dib and M. El-Saban, “Human age estimation using enhanced bio-inspired features (EBIF),” in 2010 IEEE International Conference on Image Processing (ICIP 2010), pp. 1589-1592, 2010.
[24] T.F. Cootes, C.J. Taylor, D.H. Cooper, and J. Graham, “Active shape models - their training and application,” Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38-59, 1995.
[25] D. Dunn and W.E. Higgins, “Optimal Gabor filters for texture segmentation,” IEEE Transactions on Image Processing, vol. 4, no. 7, pp. 947-964, 1995.
[26] T. Jabid, M.H. Kabir, and O. Chae, “Local directional pattern (LDP) for face recognition,” in 2010 Digest of Technical Papers International Conference on Consumer Electronics (ICCE 2010), pp. 329-330, 2010.
[27] P.J. Phillips, H. Moon, S.A. Rizvi, and P.J. Rauss, “The FERET evaluation methodology for face-recognition algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090-1104, 2000.
[28] Y. Fu, T. Hospedales, T. Xiang, Y. Yao, and S. Gong, “Interestingness Prediction by Robust Learning to Rank,” in 13th European Conference on Computer Vision (ECCV 2014), pp. 488-503, 2014.
[29] K. Ricanek and T. Tesafaye, “Morph: A longitudinal image database of normal adult age-progression,” in 7th International Conference on Automatic Face and Gesture Recognition (FGR 2006), pp. 341-345, 2006.
[30] S. Lapuschkin, A. Binder, K.R. Muller, and W. Samek, “Understanding and comparing deep neural networks for age and gender classification,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV 2017), pp. 1629-1638, 2017.
[31] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, “Caffe: Convolutional architecture for fast feature embedding,” in Proceedings of the 22nd ACM International Conference on Multimedia, pp. 675-678, 2014.
[32] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2015), pp. 1-9, 2015.
[33] J. Chen, A. Kumar, R. Ranjan, V.M. Patel, A. Alavi, and R. Chellappa, “A cascaded convolutional neural network for age estimation of unconstrained faces,” in 2016 IEEE 8th International Conference on Biometrics Theory, Applications and Systems (BTAS 2016), pp. 1-8, 2016.
[34] J. van de Wolfshaar, M.F. Karaaba, and M.A. Wiering, “Deep convolutional neural networks and support vector machines for gender recognition,” in 2015 IEEE Symposium Series on Computational Intelligence (SSCI 2015), pp. 188-195, 2015.
[35] V. Mnih, N. Heess, A. Graves, and K. Kavukcuoglu, “Recurrent models of visual attention,” in Advances in Neural Information Processing Systems (NIPS 2014), pp. 2204-2212, 2014.
[36] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems (NIPS 2014), pp. 2672-2680, 2014.
[37] T. DeVries and G. Taylor, “Improved regularization of convolutional neural networks with cutout,” arXiv preprint arXiv:1708.04552, 2017.
[38] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, “Joint face detection and alignment using multitask cascaded convolutional networks,” IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499-1503, 2016.
[39] G. Bradski, “The OpenCV library,” Dr. Dobb's Journal of Software Tools, 2000.
[40] V. Nair and G.E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proceedings of the 27th International Conference on Machine Learning (ICML 2010), 2010.
[41] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” in Proceedings of the 32nd International Conference on Machine Learning (ICML 2015), pp. 448-456, 2015.
[42] Flickr: https://www.flickr.com/, [Online], last accessed July 2020.
[43] N. Qian, “On the momentum term in gradient descent learning algorithms,” Neural Networks, vol. 12, no. 1, pp. 145-151, 1999.
[44] X. Glorot and Y. Bengio, “Understanding the difficulty of training deep feedforward neural networks,” in Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (AISTATS 2010), pp. 249-256, 2010.
[45] A. Krizhevsky, I. Sutskever, and G. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems (NIPS 2012), pp. 1097-1105, 2012.
[46] A. Gallagher and T. Chen, “Understanding Images of Groups of People,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), pp. 256-263, 2009.

Full text public date: 2025/08/05 (Intranet public)
Full text public date: This full text is not authorized to be published. (Internet public)
Full text public date: This full text is not authorized to be published. (National library)