Graduate student: | 謝睿滄 Rui-Cang Xie |
---|---|
Thesis title: | 年齡風格生成對抗網路之高解析臉部年齡轉換 Age StyleGAN for High Resolution Facial Age Transformation |
Advisor: | 徐繼聖 Gee-Sern Hsu |
Oral defense committee: | 郭景明 Jing-Ming Gu, 鍾聖倫 Sheng-Luen Chung, 林嘉文 Chia-Wen Lin, 林彥宇 Yen-Yu Lin |
Degree: | Master |
Department: | College of Engineering - Department of Mechanical Engineering |
Year of publication: | 2020 |
Academic year of graduation: | 108 |
Language: | Chinese |
Number of pages: | 65 |
Keywords (Chinese): | 臉部年齡轉換 (facial age transformation), 高解析度影像生成 (high-resolution image generation), 風格生成對抗網路 (StyleGAN) |
Keywords (English): | Facial age transformation, High resolution image generation, StyleGAN |
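Building on StyleGAN2, a state-of-the-art generative adversarial network with excellent image generation quality and latent-space disentanglement, we propose a new network architecture, Age StyleGAN (ASGAN), for the specific task of facial age transformation. ASGAN maps an input image directly to the target age distribution and synthesizes high-resolution face images with the corresponding age characteristics. We propose a new generator architecture based on the StyleGAN2 generator, consisting of an encoder and a decoder. The encoder encodes the input image and an age label into an identity latent vector and a style latent vector. The decoder starts decoding from the identity latent vector and embeds the style latent vector into its hidden layers during image generation, producing an image with the specified age style that preserves identity features. The discriminator uses a pretrained age estimation model to extract age features from different hidden layers as input to a pyramid-structured discriminator; discriminating age features at different scales lets the generated images exhibit finer age detail. We also add a conditional projection mechanism to the discriminator, which combines the age label with the discriminator features, so the discriminator also supervises whether a generated image falls in the correct age range. Compared with other methods, our method is highly competitive on the MORPH and CACD databases and achieves high-resolution age transformation on the FFHQ-Aging database.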
StyleGAN2 is a state-of-the-art network for latent space disentanglement and high-resolution image generation. Built upon StyleGAN2, our proposed Age StyleGAN (ASGAN) can smoothly transform the age of a given face. The novelties of ASGAN include the following: 1) We design an encoder that encodes a face and a desired age label into a pair of corresponding identity and style latent codes. This pair of latent codes is fed to the StyleGAN2-based decoder to generate a face with the desired age and preserved identity. 2) We design the pyramid-structured and conditional-projection (PSCP) discriminator to extract discriminative multi-scale age features, improving both the photorealism and the age accuracy of the generated faces. Experiments on the MORPH and CACD databases show that the proposed ASGAN is highly competitive with other state-of-the-art approaches, and that it achieves high-resolution age transformation on the FFHQ-Aging database.
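The conditional-projection idea in the discriminator can be illustrated with a small sketch. This is not the thesis implementation: it is a toy, pure-Python version of the general projection-discriminator form, in which a score is the sum of an unconditional term and an inner product between a learned label embedding and the image features. The names `ProjectionHead` and `pyramid_score`, the random initialisation, and the choice to sum the per-scale scores are all assumptions made for illustration; the abstract does not say how the scales are combined.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


class ProjectionHead:
    """Toy conditional-projection head (hypothetical, for illustration only).

    score(features, age_group) = psi(features) + <embedding[age_group], features>
    """

    def __init__(self, feature_dim, num_age_groups, seed=0):
        rng = random.Random(seed)
        # Linear "psi" layer: the unconditional realism term.
        self.psi_weights = [rng.uniform(-0.1, 0.1) for _ in range(feature_dim)]
        self.psi_bias = 0.0
        # One embedding vector per age group: the conditional term.
        self.age_embeddings = [
            [rng.uniform(-0.1, 0.1) for _ in range(feature_dim)]
            for _ in range(num_age_groups)
        ]

    def score(self, features, age_group):
        unconditional = dot(self.psi_weights, features) + self.psi_bias
        conditional = dot(self.age_embeddings[age_group], features)
        return unconditional + conditional


def pyramid_score(heads, multi_scale_features, age_group):
    """Sum the per-scale scores (summation is an assumption of this sketch)."""
    return sum(
        head.score(features, age_group)
        for head, features in zip(heads, multi_scale_features)
    )
```

In this sketch, each entry of `multi_scale_features` stands in for age features taken from one hidden layer of a pretrained age estimator. Changing `age_group` while keeping the features fixed changes only the conditional term, which is what would let such a discriminator penalise faces that look realistic but fall in the wrong age group.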