
Student: Chia-Yuan Hsu (許家源)
Thesis Title: Age and Gender Recognition with Random Occluded Data Augmentation on Facial Images (基於隨機影像遮蔽式資料擴增之臉部年齡及性別辨識)
Advisor: Chang-Hong Lin (林昌鴻)
Committee Members: Wei-Mei Chen (陳維美), Yie-Tarng Chen (陳郁堂), Ching-Shun Lin (林敬舜), Chang-Hong Lin (林昌鴻)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Publication Year: 2019
Graduation Academic Year: 107
Language: English
Number of Pages: 49
Chinese Keywords: Age Prediction, Gender Prediction, Deep Learning, Convolutional Neural Networks (CNN), Data Augmentation
English Keywords: Gender Classification, Age Classification, Deep Learning, Convolutional Neural Networks (CNNs), Data Augmentation

Owing to the broad range of applications of facial analysis, such as surveillance, commercial uses, human-machine interaction, and entertainment, facial analysis tasks have long been popular research topics. With the recent success of deep learning in many fields, facial analysis research has been able to move toward more difficult and realistic usage scenarios. Although tasks such as age and gender estimation have indeed achieved substantial accuracy gains over traditional machine learning approaches through the use of deep neural networks, there is still considerable room for improvement in practical applications. This thesis therefore proposes a data augmentation method for deep learning that processes the training data to simulate the difficulties encountered in real-world applications, providing the network with more diverse training samples. This improves prediction accuracy, reduces the overfitting that arises from limited data, and strengthens the robustness and generalization of the network. The method applies random occlusions to the input face images using three simple image processing techniques, Blackout, Random Brightness, and Blur, each corresponding to a different kind of challenge that arises under non-ideal conditions. To verify the effectiveness of the method, we implement the proposed data augmentation on two convolutional neural networks (CNNs), AdienceNet and VGG16, for age and gender prediction. The results show that, for age prediction, the proposed augmentation improves the accuracy of AdienceNet and VGG16 by 1.0% and 0.8%, respectively, and, for gender prediction, it improves the accuracy of AdienceNet and VGG16 by 1.5% and 1.2%, respectively.


Facial analysis tasks have been hot topics over the years because of their broad variety of applications, such as surveillance, commercial uses, human-machine interaction, and entertainment. With the recent success of deep learning, these facial analysis tasks have been able to tackle more difficult and practical situations. However, while tasks like age and gender estimation have achieved substantial improvements over traditional machine-learning-based methods, they are still far from perfect for the needs of real-life applications. This thesis proposes a data augmentation method that alters the training images so that they resemble real-life photos, improving the performance of the networks by adding more variety to the training samples. The proposed method, Random Occlusion, adopts three simple occlusion techniques, Blackout, Random Brightness, and Blur, each simulating a different kind of challenge that would be encountered in real-world applications. We verify the effectiveness of the proposed method by applying it to two convolutional neural networks (CNNs), the modified AdienceNet and VGG16, for age and gender classification. The proposed augmentation method improves the age classification accuracy of the modified AdienceNet and VGG16 by 1.0% and 0.8%, respectively, and the gender classification accuracy of AdienceNet and VGG16 by 1.5% and 1.2%, respectively.
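
Since the abstract only sketches the three occlusion techniques, the following is a minimal Python sketch of the idea, assuming OpenCV and NumPy. The function name, the region-size ratio gamma, the application probability p, the brightness offset range, and the blur kernel size are illustrative assumptions, not the thesis settings; the actual region-selection rules and parameters (e.g., the region size γ examined in Section 5.5.1) are defined in Chapters 3 and 5 of the thesis.

import random

import cv2
import numpy as np


def random_occlusion(image, gamma=0.3, p=0.5):
    """Randomly occlude a square region of a face crop with one of
    Blackout, Random Brightness, or Blur.

    Parameters (illustrative defaults, not the thesis settings):
      image : H x W x 3 uint8 face crop
      gamma : region side length as a fraction of the shorter image side
      p     : probability of applying any occlusion at all
    """
    if random.random() > p:
        return image

    out = image.copy()
    h, w = out.shape[:2]

    # Pick a random square region: size set by gamma, location uniform.
    size = max(1, int(gamma * min(h, w)))
    y = random.randint(0, h - size)
    x = random.randint(0, w - size)
    patch = out[y:y + size, x:x + size].copy()

    technique = random.choice(["blackout", "brightness", "blur"])
    if technique == "blackout":
        # Blackout: fill the region with a constant color (black here).
        patch[:] = 0
    elif technique == "brightness":
        # Random Brightness: shift the region by a random offset (assumed range).
        offset = random.randint(-60, 60)
        patch = np.clip(patch.astype(np.int16) + offset, 0, 255).astype(np.uint8)
    else:
        # Blur: smooth the region with a Gaussian kernel (assumed kernel size).
        patch = cv2.GaussianBlur(patch, (11, 11), 0)

    out[y:y + size, x:x + size] = patch
    return out

In a training pipeline, such a function would typically be applied on the fly to each aligned face crop before the batch is fed to the CNN.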

Abstract (Chinese) I
Abstract II
Acknowledgements III
LIST OF CONTENTS IV
LIST OF FIGURES VI
LIST OF TABLES VIII
CHAPTER 1 INTRODUCTIONS 1
  1.1 Motivation 1
  1.2 Contributions 2
  1.3 Thesis Organization 3
CHAPTER 2 RELATED WORKS 4
  2.1 Traditional Age and Gender Estimation 4
  2.2 Deep Learning Methods for Age and Gender Estimation 5
  2.3 Training Strategies 6
CHAPTER 3 PROPOSED METHOD 8
  3.1 Region Selection 8
    3.1.1 Region Size 9
    3.1.2 Region Location 9
  3.2 Blackout 11
  3.3 Random Brightness 12
  3.4 Blur 14
CHAPTER 4 IMPLEMENTATION 16
  4.1 Preprocessing 16
    4.1.1 Face Detection 17
    4.1.2 Face Alignment 17
  4.2 Data Augmentation 18
  4.3 CNN 19
    4.3.1 Modified AdienceNet [17] 19
    4.3.2 VGG16 [19] 21
    4.3.3 Training Parameters 22
CHAPTER 5 EXPERIMENTAL RESULTS 25
  5.1 Experimental Environment 25
  5.2 Adience Database [7] 26
  5.3 Prediction 27
  5.4 Performance Evaluation 27
    5.4.1 Gender Classification 27
    5.4.2 Age Classification 28
  5.5 Performance Evaluation with Different Settings 29
    5.5.1 Region Size γ 29
    5.5.2 Blackout Color 30
CHAPTER 6 CONCLUSIONS AND FUTURE WORKS 32
  6.1 Conclusions 32
  6.2 Future Works 33
REFERENCES 34

[1] A.K. Jain, A. Ross, and S. Prabhakar, "An introduction to biometric recognition," IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 1, pp. 4-20, 2004.
[2] R. Maldonado, P. Tansuhaj, and D.D. Muehling, "The impact of gender on ad processing: A social identity perspective," Academy of Marketing Science Review, vol. 3, no. 3, pp. 1-15, 2003.
[3] K. Luu, K. Ricanek, T.D. Bui, and C.Y. Suen, "Age estimation using active appearance models and support vector machine regression," 2009 IEEE 3rd International Conference on Biometrics: Theory, Applications, and Systems, IEEE, 2009.
[4] A. Gunay and V.V. Nabiyev, "Automatic age classification with LBP," 2008 23rd International Symposium on Computer and Information Sciences, IEEE, 2008.
[5] M. Hu, Y. Zheng, F. Ren, and H. Jiang, "Age estimation and gender classification of facial images based on Local Directional Pattern," 2014 IEEE 3rd International Conference on Cloud Computing and Intelligence Systems, IEEE, 2014.
[6] C. Shan, "Learning local features for age estimation on real-life faces," Proceedings of the 1st ACM International Workshop on Multimodal Pervasive Video Analysis, ACM, 2010.
[7] E. Eidinger, R. Enbar, and T. Hassner, "Age and gender estimation of unfiltered faces," IEEE Transactions on Information Forensics and Security, vol. 9, no. 12, pp. 2170-2179, 2014.
[8] R. Ranjan, S. Sankaranarayanan, C.D. Castillo, and R. Chellappa, "An all-in-one convolutional neural network for face analysis," 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), IEEE, 2017.
[9] P. Rodríguez, G. Cucurull, J.M. Gonfaus, F.X. Roca, and J. Gonzalez, "Age and gender recognition in the wild with deep attention," Pattern Recognition, vol. 72, pp. 563-571, 2017.
[10] R. Rothe, R. Timofte, and L. Van Gool, "DEX: Deep expectation of apparent age from a single image," Proceedings of the IEEE International Conference on Computer Vision Workshops, 2015.
[11] G. Ozbulak, Y. Aytar, and H.K. Ekenel, "How transferable are CNN-based features for age and gender classification?," 2016 International Conference of the Biometrics Special Interest Group (BIOSIG), IEEE, 2016.
[12] T. van Laarhoven, "L2 regularization versus batch and weight normalization," arXiv preprint arXiv:1706.05350, 2017.
[13] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929-1958, 2014.
[14] B. Zoph, E.D. Cubuk, G. Ghiasi, T.Y. Lin, J. Shlens, and Q.V. Le, "Learning data augmentation strategies for object detection," arXiv preprint arXiv:1906.11172, 2019.
[15] J. Wang and L. Perez, "The effectiveness of data augmentation in image classification using deep learning," arXiv preprint arXiv:1712.04621, 2017.
[16] H. Inoue, "Data augmentation by pairing samples for images classification," arXiv preprint arXiv:1801.02929, 2018.
[17] G. Levi and T. Hassner, "Age and gender classification using convolutional neural networks," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2015.
[18] S. Lapuschkin, A. Binder, K.R. Müller, and W. Samek, "Understanding and comparing deep neural networks for age and gender classification," Proceedings of the IEEE International Conference on Computer Vision, 2017.
[19] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[20] T.F. Cootes, G.J. Edwards, and C.J. Taylor, "Active appearance models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 6, pp. 681-685, 2001.
[21] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273-297, 1995.
[22] Y. Fu, G. Guo, and T.S. Huang, "Age synthesis and estimation via faces: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 11, pp. 1955-1976, 2010.
[23] M.Y. El Dib and M. El-Saban, "Human age estimation using enhanced bio-inspired features (EBIF)," 2010 IEEE International Conference on Image Processing, IEEE, 2010.
[24] T.F. Cootes, C.J. Taylor, D.H. Cooper, and J. Graham, "Active shape models - their training and application," Computer Vision and Image Understanding, vol. 61, no. 1, pp. 38-59, 1995.
[25] T. Jabid, M.H. Kabir, and O. Chae, "Local directional pattern (LDP) for face recognition," 2010 Digest of Technical Papers International Conference on Consumer Electronics (ICCE), IEEE, 2010.
[26] Y. Fu, FG-NET dataset: https://yanweifu.github.io/FG_NET_data/index.html, [Online].
[27] K. Ricanek and T. Tesafaye, "MORPH: A longitudinal image database of normal adult age-progression," 7th International Conference on Automatic Face and Gesture Recognition (FGR06), IEEE, 2006.
[28] A.C. Gallagher and T. Chen, "Understanding images of groups of people," 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2009.
[29] T. Ahonen, A. Hadid, and M. Pietikainen, "Face description with local binary patterns: Application to face recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 12, pp. 2037-2041, 2006.
[30] D. Dunn and W.E. Higgins, "Optimal Gabor filters for texture segmentation," IEEE Transactions on Image Processing, vol. 4, no. 7, pp. 947-964, 1995.
[31] L. Wolf, T. Hassner, and Y. Taigman, "Descriptor based methods in the wild," Faces in Real-Life Images Workshop at the European Conference on Computer Vision (ECCV), Marseille, 2008.
[32] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell, "Caffe: Convolutional architecture for fast feature embedding," Proceedings of the 22nd ACM International Conference on Multimedia, ACM, 2014.
[33] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[34] S. Hosseini, S.H. Lee, H.J. Kwon, H.I. Koo, and N.I. Cho, "Age and gender classification using wide convolutional neural network and Gabor filter," 2018 International Workshop on Advanced Image Technology (IWAIT), IEEE, 2018.
[35] V. Mnih, N. Heess, and A. Graves, "Recurrent models of visual attention," Advances in Neural Information Processing Systems, MIT Press, pp. 2204-2212, 2014.
[36] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial nets," Advances in Neural Information Processing Systems, MIT Press, pp. 2672-2680, 2014.
[37] T. DeVries and G.W. Taylor, "Improved regularization of convolutional neural networks with cutout," arXiv preprint arXiv:1708.04552, 2017.
[38] K. Zhang, Z. Zhang, Z. Li, and Y. Qiao, "Joint face detection and alignment using multitask cascaded convolutional networks," IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499-1503, 2016.
[39] OpenCV: https://opencv.org/, [Online].
[40] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," arXiv preprint arXiv:1502.03167, 2015.
[41] V. Nair and G.E. Hinton, "Rectified linear units improve restricted Boltzmann machines," Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010.
[42] N. Silberman and S. Guadarrama, TensorFlow-Slim image classification model library: https://github.com/tensorflow/models/tree/master/research/slim, [Online], 2016.
[43] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A.C. Berg, and F.F. Li, "ImageNet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211-252, 2015.
[44] N. Qian, "On the momentum term in gradient descent learning algorithms," Neural Networks, vol. 12, no. 1, pp. 145-151, 1999.
[45] Python: https://www.python.org/, [Online].
[46] TensorFlow: https://www.tensorflow.org/, [Online].
[47] Flickr: https://www.flickr.com/, [Online].

Full-text release date: 2024/07/31 (campus network)
Full-text release date: 2024/07/31 (off-campus network)
Full-text release date: 2024/07/31 (National Central Library: Taiwan thesis system)