
Student: Li-Bang Huang (黃立邦)
Thesis title: Facial Age Transformation between Childhood and Seniorhood
Advisors: Sheng-Luen Chung, Gee-Sern Jison Hsu
Committee members: Yi-Ping Hung, Yung-Yu Chuang, Wen-Huang Cheng, Gee-Sern Jison Hsu, Sheng-Luen Chung
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2021
Academic year of graduation: 109 (2020-2021)
Language: Chinese
Number of pages: 72
Keywords: age transformation, StarGAN v2, Face++, ArcFace

Chinese abstract (translated):

Full-lifespan facial age transformation addresses the problem of generating, from a single face image, how the same person looked at a younger age and how they will look in old age. The transformation must reflect the changes in head shape and facial texture that appear at different age stages. The difficulty is twofold: it is infeasible to collect, at scale, images of the same person spanning a whole lifetime to serve as an effective training set; and, lacking a large number of valid full-lifespan real samples, it is hard to conduct objective and effective performance evaluation. Accordingly, for the task of full-age-range face generation, this thesis builds on StarGAN v2, treats age as the attribute of the target domain, and makes the following adjustments: (1) a FAN sub-network that focuses on the contours of facial features is added to the generator via skip connections, with a weighting scheme that adjusts the proportion of identity-representing features fed into the generator's decoder; (2) a Light CNN sub-network that compares the identity of the source image against that of the generated image is added during training, so that the new network can effectively disentangle the age-, identity-, and other style-related features of a face image; (3) finally, the source image is also used as the style image, excluding non-age-related style features from the generation pipeline, which preserves the identity features of the source image and yields better disentanglement of identity and age. For comparison with LATS, the current state of the art in facial age transformation, we likewise use the FFHQ-aging dataset, and apply Face++ and ArcFace to perform large-scale age estimation and identity recognition on the age-transformed images. Experimental results confirm that the proposed network architecture has stronger feature-disentanglement ability and outperforms LATS in generated-image quality, intended age-transformation effect, and identity preservation.
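Adjustment (1) above, the weighted heatmap-gated skip connection, can be sketched as follows. This is a minimal illustrative sketch, not the thesis code: the function name, tensor shapes, and the blending rule `dec + w * (enc * heatmap)` are assumptions standing in for the actual FAN-based mechanism.

```python
import numpy as np

def weighted_heatmap_skip(enc_feat, dec_feat, heatmap, w=0.5):
    """Blend encoder features into the decoder, gated by a facial-landmark
    heatmap and scaled by an adjustable weight w (names are illustrative)."""
    # heatmap: (1, H, W) in [0, 1], highlighting facial-feature contours;
    # it broadcasts across the channel dimension of enc_feat.
    gated = enc_feat * heatmap       # keep identity cues near the landmarks
    return dec_feat + w * gated      # w controls the identity contribution

# Toy usage with random feature maps.
enc = np.random.rand(64, 32, 32)     # encoder features (C, H, W)
dec = np.random.rand(64, 32, 32)     # decoder features (C, H, W)
hm = np.random.rand(1, 32, 32)       # landmark heatmap
out = weighted_heatmap_skip(enc, dec, hm, w=0.3)
```

Setting `w = 0` disables the skip path entirely, recovering the plain decoder features; larger `w` feeds more identity detail into the decoder.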


English abstract:

This study aims to generate a facial image reflecting how a person might look in the future, or looked in the past, from a single photo. The transformation must take into account the changes in head shape and facial texture presented at different ages. The task is challenging in that it is impossible to collect, on a large scale, a training set containing identity-specific facial images spanning a whole lifetime. Furthermore, due to the lack of sufficient facial images covering a whole lifespan, it is difficult to conduct systematic performance evaluation. In this regard, taking advantage of the excellent performance of StarGAN v2 in style transfer, where the generator can be supervised by the discriminator to generate the desired target style, this study treats age transformation as a transformation of facial aging patterns. In our adapted StarGAN v2-based solution, the age is specified by the target domain. Three adaptations have been made: (1) an adjustable weight for the adaptive-wing-based heatmap, (2) an identity loss computed by Light CNN during the training phase, and (3) exclusion of non-age-related features by duplicating the source image as the reference image. To compare with the state-of-the-art LATS, we use the FFHQ-aging dataset for training and inference, and use Face++ and ArcFace to perform large-scale age estimation and identity recognition on the age-transformed images. Experimental results confirm the proposed solution's superiority over LATS in terms of image quality, intended age-rendition effect, and identity preservation.
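Adaptation (2), the identity loss, compares face embeddings of the source and generated images. The sketch below is a hedged illustration under assumptions: a generic embedding vector stands in for Light CNN's output, and the loss is taken as one minus cosine similarity, a common choice for embedding-based identity losses (the thesis may use a different distance).

```python
import numpy as np

def identity_loss(emb_src, emb_gen):
    """1 - cosine similarity between the source and generated face
    embeddings; 0 when identical, up to 2 when opposed."""
    cos = float(np.dot(emb_src, emb_gen) /
                (np.linalg.norm(emb_src) * np.linalg.norm(emb_gen)))
    return 1.0 - cos

# Identical embeddings incur (numerically) zero loss.
e = np.array([0.3, 0.8, 0.5])
loss_same = identity_loss(e, e)
```

During training, minimizing this term penalizes the generator whenever the age-transformed face drifts away from the source identity in embedding space.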

Table of Contents

Abstract (Chinese) / Abstract (English) / Contents / List of Figures / List of Tables
Chapter 1: Introduction
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis organization
Chapter 2: Literature Review
  2.1 High-quality Image Generation
    2.1.1 PG-GAN
    2.1.2 StyleGAN
  2.2 Image-to-Image Translation
    2.2.1 CycleGAN
    2.2.2 StarGAN
  2.3 StarGAN v2
    2.3.1 StarGAN v2 network architecture
    2.3.2 Loss functions
    2.3.3 FAN and Skip Connection
  2.4 LATS
    2.4.1 Network architecture
    2.4.2 Loss functions
  2.5 Face Recognition
    2.5.1 Light CNN
    2.5.2 ArcFace
Chapter 3: Methodology
  3.1 Overview
  3.2 Identity-preservation mechanism
    3.2.1 FAN and Skip Connection Mechanism
    3.2.2 Identity preservation with Light CNN
  3.3 Training method
    3.3.1 Training procedure
    3.3.2 Implementation details
Chapter 4: Experiments and Discussion
  4.1 Datasets
  4.2 Identity-preservation evaluation
  4.3 Age-transformation evaluation
  4.4 Comparison of the proposed method with real-world cases
Chapter 5: Comparison with LATS
  5.1 Dataset usage
  5.2 Loss functions
  5.3 Feature-disentanglement properties
  5.4 Performance evaluation methods
  5.5 Visual comparison of images
Chapter 6: Conclusion
References
Appendix

[1] I. J. Goodfellow et al., "Generative adversarial nets," in Advances in Neural Information Processing Systems, 2014, vol. 3, pp. 2672-2680.
[2] X. Huang and S. Belongie, "Arbitrary Style Transfer in Real-Time with Adaptive Instance Normalization," in Proceedings of the IEEE International Conference on Computer Vision, 2017, vol. 2017-October, pp. 1510-1519.
[3] J. Y. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks," in Proceedings of the IEEE International Conference on Computer Vision, 2017, vol. 2017-October, pp. 2242-2251.
[4] Y. Choi, M. Choi, M. Kim, J. W. Ha, S. Kim, and J. Choo, "StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2018, pp. 8789-8797.
[5] Y. Choi, Y. Uh, J. Yoo, and J.-W. Ha, "StarGAN v2: Diverse Image Synthesis for Multiple Domains," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 8188-8197.
[6] A. Bulat and G. Tzimiropoulos, "How Far are We from Solving the 2D & 3D Face Alignment Problem? (and a Dataset of 230,000 3D Facial Landmarks)," in Proceedings of the IEEE International Conference on Computer Vision, 2017, vol. 2017-October, pp. 1021-1030.
[7] X. Wang, L. Bo, and L. Fuxin, "Adaptive wing loss for robust face alignment via heatmap regression," in Proceedings of the IEEE International Conference on Computer Vision, 2019, vol. 2019-October, pp. 6970-6980.
[8] R. Or-El, S. Sengupta, O. Fried, E. Shechtman, and I. Kemelmacher-Shlizerman, "Lifespan Age Transformation Synthesis," in Proceedings of the European Conference on Computer Vision, 2020.
[9] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and T. Aila, "Analyzing and improving the image quality of stylegan," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 8110-8119.
[10] X. Huang, M. Y. Liu, S. Belongie, and J. Kautz, "Multimodal Unsupervised Image-to-Image Translation," in Proceedings of the European Conference on Computer Vision (LNCS vol. 11207), 2018, pp. 179-196.
[11] W. Liu, Y. Wen, Z. Yu, M. Li, B. Raj, and L. Song, "SphereFace: Deep hypersphere embedding for face recognition," in Proceedings - 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, 2017, vol. 2017-January, pp. 6738-6746.
[12] X. Wu, R. He, Z. Sun, and T. Tan, "A light CNN for deep face representation with noisy labels," IEEE Transactions on Information Forensics and Security, Article vol. 13, no. 11, pp. 2884-2896, 2018.
[13] B. C. Chen, C. S. Chen, and W. H. Hsu, "Face recognition and retrieval using cross-age reference coding with cross-age celebrity dataset," IEEE Transactions on Multimedia, Article vol. 17, no. 6, pp. 804-815, 2015, Art. no. 7080893.
[14] T. Karras, T. Aila, S. Laine, and J. Lehtinen, "Progressive growing of GANs for improved quality, stability, and variation," in 6th International Conference on Learning Representations, ICLR 2018 - Conference Track Proceedings, 2018.
[15] T. Karras, S. Laine, and T. Aila, "A style-based generator architecture for generative adversarial networks," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019, vol. 2019-June, pp. 4396-4405.
[16] C. H. Lee, Z. Liu, L. Wu, and P. Luo, "MaskGAN: Towards Diverse and Interactive Facial Image Manipulation," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2020, pp. 5548-5557.
[17] J. Deng, J. Guo, N. Xue, and S. Zafeiriou, "ArcFace: Additive angular margin loss for deep face recognition," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019, vol. 2019-June, pp. 4685-4694.
[18] M. Inc. Face++ API. Available: http://www.faceplusplus.com
[19] Z. Wang, X. Tang, W. Luo, and S. Gao, "Face Aging with Identity-Preserved Conditional Generative Adversarial Networks," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2018, pp. 7939-7947.
[20] H. Yang, D. Huang, Y. Wang, and A. K. Jain, "Learning Face Age Progression: A Pyramid Architecture of GANs," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2018, pp. 31-39.

Full-text release date: 2026/02/01 (campus network)
Full-text release date: 2026/02/01 (off-campus network)
Full-text release date: 2026/02/01 (National Central Library: Taiwan thesis system)