Graduate student: | 何岡秩 Kang-Chi Ho |
---|---|
Thesis title: | 基於Wasserstein生成對抗網絡進行可解析之臉部特徵學習 Transformation of Identity-Preserved Facial Features using Wasserstein Generative Adversarial Network with Gradient Penalty |
Advisor: | 徐繼聖 Gee-Sern Hsu |
Oral defense committee: | 洪一平 Yi-Ping Hung, 王鈺強 Yu-Chiang Wang, 郭景明 Jing-Ming Guo, 林惠勇 Huei-Yung Lin |
Degree: | Master |
Department: | College of Engineering - Department of Mechanical Engineering |
Year of publication: | 2018 |
Academic year of graduation: | 106 |
Language: | Chinese |
Number of pages: | 60 |
Chinese keywords: | 生成對抗網路, 人臉辨識, 人臉正面化 |
English keywords: | Generative Adversarial Network, Face Recognition, Facial Frontalization |
We propose Disentangled Representation Learning on a Wasserstein Generative Adversarial Network with Gradient Penalty, abbreviated as DR-WGAN, for cross-pose face recognition. DR-WGAN builds on the state-of-the-art DR-GAN (Disentangled Representation Learning GAN), which learns a disentangled facial representation suited to cross-pose recognition. In the design of the discriminator, DR-WGAN uses the Wasserstein-1 distance with a gradient penalty in place of the Jensen-Shannon (JS) divergence used in DR-GAN, which substantially improves training stability and, in turn, the quality of the generated images. Because the Wasserstein-1 distance and gradient penalty change how the discriminator is trained, the overall generative and adversarial framework has to be redesigned. We study how three design choices affect network performance: 1) data normalization, 2) activation functions, and 3) augmentation of the training data, and we highlight the issues to consider in the redesigned framework. Two redesigned structures are studied: one consisting of a generator (G) and a discriminator (D), and the other adding a classifier (C) to the first. Experiments on the MPIE benchmark database show that DR-WGAN outperforms DR-GAN and other state-of-the-art approaches. Experiments on the CFP database show that the framework with the additional classifier outperforms the one with G and D alone.
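As a rough illustration of the training objective the abstract refers to, the sketch below evaluates a Wasserstein critic loss with a gradient penalty: the mean critic score on generated samples minus the mean score on real samples, plus a weight times the squared deviation of the critic's input-gradient norm from 1 at points interpolated between real and generated samples. It is not the thesis's network. The linear critic, the function names, and the penalty weight `lam=10.0` are assumptions made for this example; a linear critic is chosen only because its input gradient is known in closed form, whereas a deep discriminator would obtain it by automatic differentiation.

```python
import math
import random


def critic(w, b, x):
    """Toy linear critic D(x) = w . x + b (hypothetical stand-in for a deep network)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def critic_grad_x(w, b, x):
    """Gradient of the linear critic with respect to its input x, which is simply w."""
    return list(w)


def wgan_gp_critic_loss(w, b, real, fake, lam=10.0, rng=None):
    """Critic loss: mean D(fake) - mean D(real) + lam * gradient penalty."""
    rng = rng or random.Random(0)
    n = len(real)

    # Wasserstein term: the critic tries to score real samples above generated ones.
    mean_fake = sum(critic(w, b, x) for x in fake) / n
    mean_real = sum(critic(w, b, x) for x in real) / n

    # Gradient penalty, evaluated at random interpolates of real/fake pairs.
    penalty = 0.0
    for x_real, x_fake in zip(real, fake):
        eps = rng.random()
        x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(x_real, x_fake)]
        grad = critic_grad_x(w, b, x_hat)
        grad_norm = math.sqrt(sum(g * g for g in grad))
        penalty += (grad_norm - 1.0) ** 2
    penalty /= n

    return mean_fake - mean_real + lam * penalty


real = [[1.0, 0.0], [0.0, 1.0]]
fake = [[0.0, 0.0], [0.0, 0.0]]

# A critic whose gradient norm is 1 incurs (almost) no penalty.
loss_unit = wgan_gp_critic_loss([0.6, 0.8], 0.0, real, fake)

# A critic whose gradient norm is 5 is penalised by lam * (5 - 1)^2.
loss_steep = wgan_gp_critic_loss([3.0, 4.0], 0.0, real, fake)
```

With the unit-norm weights the loss reduces to the Wasserstein term alone, while the steeper critic is dominated by the penalty, which is the mechanism that keeps the critic approximately 1-Lipschitz during training.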