
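Graduate student: 謝睿滄 (Rui-Cang Xie)
Thesis title: 年齡風格生成對抗網路之高解析臉部年齡轉換
Age StyleGAN for High Resolution Facial Age Transformation
Advisor: 徐繼聖 (Gee-Sern Hsu)
Committee members: 郭景明 (Jing-Ming Gu), 鍾聖倫 (Sheng-Luen Chung), 林嘉文 (Chia-Wen Lin), 林彥宇 (Yen-Yu Lin)
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of publication: 2020
Academic year of graduation: 108
Language: Chinese
Pages: 65
Chinese keywords: 臉部年齡轉換 (facial age transformation), 高解析度影像生成 (high-resolution image generation), 風格生成對抗網路 (StyleGAN)
English keywords: Facial age transformation, High resolution image generation, StyleGAN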

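Building on the state-of-the-art generative adversarial network StyleGAN2, with its excellent image generation quality and latent-space disentanglement, we propose a new network architecture, the Age StyleGAN (ASGAN), for the specific task of facial age transformation. It maps an input image directly to the target age distribution and synthesizes high-resolution face images with the corresponding age characteristics. We propose a new generator architecture based on the StyleGAN2 generator, consisting of an encoder and a decoder. The encoder encodes the input image and an age label into an identity latent vector and a style latent vector. The decoder starts decoding from the identity latent vector and, while generating the image, embeds the style latent vector into the hidden layers, producing an image that carries the specified age style and preserves the identity features. The discriminator uses a pretrained age estimation model to extract age features from different hidden layers, which serve as the inputs to a pyramid-structured discriminator; discriminating age features at different scales lets the generated images show finer age details. In addition, a conditional projection mechanism is added to the discriminator to combine the age label with the discriminator features, so the discriminator also supervises whether the generated image falls in the correct age range. Compared with other methods, our method is highly competitive on the MORPH and CACD databases and achieves high-resolution age transformation on the FFHQ-Aging database.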


StyleGAN2 is a state-of-the-art network for latent space disentanglement and high-resolution image generation. Built upon StyleGAN2, our proposed Age StyleGAN (ASGAN) can smoothly transform the age of a given face. The novelties of ASGAN include the following: 1) We design an encoder that encodes a face and a desired age label into a pair of corresponding identity and style latent codes. This pair is fed to the StyleGAN2-based decoder to generate a face with the desired age and preserved identity. 2) We design a pyramid-structured and conditional-projection (PSCP) discriminator that extracts discriminative multi-scale age features to improve the photo-realistic quality and the accuracy of the age traits of the generated faces. Experiments on the MORPH and CACD databases show that the proposed ASGAN is highly competitive with other state-of-the-art approaches, and it achieves high-resolution age transformation on the FFHQ-Aging database.
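The conditional projection idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the general projection-discriminator formulation reviewed in Section 2.6 of the thesis ("cGANs with Projection Discriminator"), where the discriminator output is an unconditional term plus the inner product of a learned label embedding with the feature vector. The function names, the toy feature vector, and the two age groups below are hypothetical and are not taken from the thesis; the actual PSCP discriminator may combine its pyramid features differently.

```python
# Hypothetical sketch of a projection-style conditional discriminator output.
# Names, sizes, and values are illustrative only, not the thesis implementation.

def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))

def projection_score(features, age_group, class_embeddings, psi_weights, psi_bias=0.0):
    """Return psi(features) + <embedding[age_group], features>.

    features         : feature vector extracted from an image
    age_group        : integer index of the target age group (assumed labeling)
    class_embeddings : one embedding vector per age group
    psi_weights      : weights of the unconditional linear head psi
    """
    unconditional = dot(psi_weights, features) + psi_bias
    conditional = dot(class_embeddings[age_group], features)
    return unconditional + conditional

# Toy example: a 3-dimensional feature vector and two age groups.
features = [1.0, 2.0, -1.0]
class_embeddings = [[0.5, 0.0, 0.0], [0.0, 0.0, 1.0]]
psi_weights = [0.0, 1.0, 0.0]

score_group0 = projection_score(features, 0, class_embeddings, psi_weights)
score_group1 = projection_score(features, 1, class_embeddings, psi_weights)
print(score_group0, score_group1)
```

In this toy setting the same features receive different scores depending on the age label they are paired with, which is how a projection term lets a discriminator penalize images that look realistic but do not match the requested age group.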

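Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Method Overview
1.3 Contributions
1.4 Thesis Organization
Chapter 2 Literature Review
2.1 Generative Adversarial Nets
2.2 Progressive Growing of GANs
2.3 A Style-based GAN Architecture
2.4 Analyzing and Improving the Image Quality of StyleGAN
2.5 Learning Continuous Face Age Progression: A Pyramid of GANs
2.6 cGANs with Projection Discriminator
Chapter 3 Proposed Method
3.1 Generator Design
3.2 Discriminator Design
3.3 Overall Network Architecture
Chapter 4 Experimental Setup and Analysis
4.1 Databases
4.1.1 MORPH Database
4.1.2 Cross-Age Celebrity Database
4.1.3 Flickr-Faces-HQ-Aging
4.2 Experimental Setup
4.2.1 Data Split
4.2.2 Network Architecture Settings
4.2.3 Performance Evaluation
4.3 Experimental Results and Analysis
4.3.1 Comparison of Generator Settings
4.3.2 Comparison of Discriminator Settings
4.4 Performance Comparison with Related Work
4.4.1 Comparison of Accurate Facial Age Transformation
4.4.2 Comparison of High-Resolution Facial Age Transformation
Chapter 5 Conclusion and Future Work
Chapter 6 References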

