
Author: 戴廷宇
Ting-Yu Tai
Thesis title: 以殘差圖卷積網路進行三維人臉重建
Residual Graph Convolutional Networks for 3D Face Reconstruction
Advisor: 徐繼聖
Gee-Sern Hsu
Oral examination committee: 鍾聖倫
Sheng-Luen Chung
林嘉文
Chia-Wen Lin
賴尚宏
Shang-Hong Lai
林惠勇
Huei-Yung Lin
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of publication: 2021
Academic year of graduation: 109
Language: Chinese
Number of pages: 69
Chinese keywords: 三維人臉重建, 圖卷積網路
English keywords: 3D Face Reconstruction, Graph Convolutional Networks

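The 3D face reconstruction task recovers facial geometry from one or more 2D images. Many recent methods combine the 3D Morphable Model (3DMM) with deep convolutional neural networks (DCNNs), using the DCNN to predict the 3DMM parameters that correspond to a 2D image; this approach helps reconstruct good 3D face models in unconstrained environments. However, because 3D face scan datasets are limited, reconstructing a 3D face model and its texture from a single image remains challenging. To address this problem, we propose a novel framework that takes nonlinear 3D geometry and surface smoothness into account, combining a 3D morphable model module (Linear Parametric Module, LPM) with a 3D face model generation module (3D Face Generation Module, 3D-FGM) to accomplish 3D face reconstruction. The LPM is taken from a pretrained model and generates a 3D face model from the 3DMM parameters corresponding to a 2D image. The 3D-FGM combines a face encoder with a graph convolutional network (GCN) to generate another, high-quality 3D face model. With the LPM-generated 3D face model as the target, the 3D-FGM is optimized to generate a higher-quality 3D face model. Because the 3D-FGM architecture is a nonlinear model, it can generate the corresponding 3D face model more effectively; the 3D face models generated by the 3D-FGM are therefore of higher quality than those of the LPM. The method is highly competitive on the AFLW2000-3D and MICC Florence 3D Faces databases.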
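As a rough illustration of how a linear parametric model can turn 3DMM parameters into a 3D face shape, the sketch below decodes a mesh as the mean shape plus a weighted sum of basis vectors. It is a minimal sketch, not the pretrained model used in the thesis: the function names, the split into identity and expression bases, and the toy sizes are all assumptions made for illustration.

```python
# Minimal sketch of a linear 3DMM decoder (illustrative only; all names and
# sizes are assumptions, not taken from the thesis or its pretrained model).

def decode_3dmm(mean_shape, id_basis, exp_basis, alpha, beta):
    """Return mean + sum_i alpha_i * id_basis_i + sum_j beta_j * exp_basis_j.

    mean_shape: flat list of length 3 * n_vertices (x0, y0, z0, x1, ...).
    id_basis / exp_basis: lists of basis vectors, each as long as mean_shape.
    alpha / beta: identity and expression coefficients (hypothetical split).
    """
    shape = list(mean_shape)
    terms = list(zip(alpha, id_basis)) + list(zip(beta, exp_basis))
    for coeff, basis in terms:
        for k, b in enumerate(basis):
            shape[k] += coeff * b
    return shape


def to_vertices(flat_shape):
    """Group a flat coordinate list into (x, y, z) tuples."""
    return [tuple(flat_shape[i:i + 3]) for i in range(0, len(flat_shape), 3)]


# Toy model: 2 vertices, 2 identity components, 1 expression component.
mean_shape = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
id_basis = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]]
exp_basis = [[0.0, 0.0, 1.0, 0.0, 0.0, 1.0]]

vertices = to_vertices(
    decode_3dmm(mean_shape, id_basis, exp_basis, alpha=[0.5, -1.0], beta=[2.0])
)
print(vertices)
```

Because the decoder is linear in the coefficients, every shape it can produce lies in the span of the basis vectors around the mean, which is the limitation the nonlinear 3D-FGM is meant to relax.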


3D face reconstruction aims to recover the 3D geometry of a face from one or more 2D images. Many recent approaches combine the 3D Morphable Model (3DMM) with a Deep Convolutional Neural Network (DCNN) to tackle this task. However, the accuracy of the facial shape reconstructed from a monocular image still leaves considerable room for improvement. We propose a framework that couples a Linear Parametric Model (LPM) with a 3D Face Generation Module (3D-FGM), trained with an objective that accounts for nonlinear 3D facial geometry and surface smoothness. The LPM, taken from an off-the-shelf pretrained model, generates the 3D shape and texture of a 2D face in terms of 3DMM coefficients. The 3D-FGM combines a face encoder and a Graph Convolutional Network (GCN) to generate another 3D shape for the same 2D face. Taking the LPM-generated 3D shape as a reference, the 3D-FGM-based shape is iteratively improved by updating the GCN parameters during training. Because the 3D-FGM is by nature a nonlinear model that can better capture the nonlinearity of a face shape, the 3D-FGM-based shape is trained to outperform the LPM-generated shape in reconstructing the 3D shape of the target face. Experiments on the AFLW2000-3D and MICC Florence 3D Faces datasets show that the proposed approach performs better than state-of-the-art methods.
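The abstract does not spell out the layer or loss definitions, so the following is only a hedged sketch of the general idea: a graph-convolution layer with a residual (skip) connection operating on mesh vertices, a loss against a reference shape such as the LPM output, and a simple smoothness term. The row-normalized adjacency, the ReLU update, and both loss forms are assumptions chosen for illustration, not the thesis's actual design.

```python
# Sketch of one residual graph-convolution layer on a mesh, with a reference
# loss and a smoothness term. Layer form, loss forms, and all names are
# assumptions for illustration, not the design used in the thesis.

def normalized_adjacency(n_vertices, edges):
    """Row-normalized adjacency with self-loops: A_hat = D^-1 (A + I)."""
    adj = [[1.0 if i == j else 0.0 for j in range(n_vertices)]
           for i in range(n_vertices)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1.0
    return [[a / sum(row) for a in row] for row in adj]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def residual_gcn_layer(a_hat, feats, weight):
    """H_out = H + ReLU(A_hat @ H @ W); W must be square so the skip adds up."""
    update = matmul(matmul(a_hat, feats), weight)
    return [[h + max(0.0, u) for h, u in zip(h_row, u_row)]
            for h_row, u_row in zip(feats, update)]


def reference_loss(pred, ref):
    """Mean squared vertex distance to a reference shape (e.g. the LPM output)."""
    return sum(sum((p - r) ** 2 for p, r in zip(pv, rv))
               for pv, rv in zip(pred, ref)) / len(pred)


def smoothness_loss(a_hat, verts):
    """Mean squared distance of each vertex to its neighborhood average.

    The average includes the vertex itself because A_hat has self-loops.
    """
    return reference_loss(verts, matmul(a_hat, verts))


# Toy mesh: a single triangle whose vertex features are its coordinates.
a_hat = normalized_adjacency(3, [(0, 1), (1, 2), (0, 2)])
verts = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

refined = residual_gcn_layer(a_hat, verts, identity)
print(refined, reference_loss(refined, verts), smoothness_loss(a_hat, verts))
```

With an all-zero weight matrix the layer returns its input unchanged, which is the usual motivation for residual connections: each layer only has to learn a correction to the current vertex features.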

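Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Research Background and Motivation
  1.2 Method Overview
  1.3 Contributions
  1.4 Thesis Organization
Chapter 2 Literature Review
  2.1 3DMM
  2.2 VRN
  2.3 3DDFA
  2.4 Nonlinear-3DMM
  2.5 PRNet
  2.6 Deep3DFace
Chapter 3 Proposed Method
  3.1 Overall Network Architecture
  3.2 Design of the Linear 3D Morphable Model Module
  3.3 3D Face Generation Module (3D-FGM)
Chapter 4 Experimental Setup and Analysis
  4.1 Databases
    4.1.1 Large-scale CelebFaces Attributes Database
    4.1.2 300W-LP Database
    4.1.3 MICC Florence 3D Faces Database
    4.1.4 AFLW2000-3D Database
  4.2 Experimental Setup
    4.2.1 Data Partitioning and Settings
    4.2.2 Performance Evaluation Metrics
    4.2.3 Experiment Design
  4.3 Experimental Results and Analysis
    4.3.1 Comparison of Loss Function Settings for the 3D Face Generation Module
    4.3.2 Comparison of Module Settings for the 3D Face Generation Module
    4.3.3 Effect of the 3D Face Generation Module on the Generated 3D Face Models
  4.4 Comparison with Related Work
Chapter 5 Conclusions and Future Research Directions
Chapter 6 References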

