Graduate Student: 張詠茹 Yung-Ju Chang
Thesis Title: 應用立體光度法之人像光影重建於360度影像合成研究 (Study on 360-degree Image Composition for Relit Portraits Based on Photometric Stereo Method)
Advisor: 林宗翰 Tzung-Han Lin
Oral Defense Committee: 歐立成 Li-Chen Ou, 孫沛立 Pei-Li Sun, 蔡燿全 Yao-Chuan Tsai
Degree: Master
Department: College of Applied Sciences, Graduate Institute of Color and Illumination Technology
Publication Year: 2022
Academic Year: 110 (2021-2022)
Language: Chinese
Pages: 99
Keywords: Image composition, image relighting, portrait photography, photometric stereo, normal map, 360-degree environment map, high dynamic range imaging (HDRI), inverse rendering
To achieve more realistic results, modern image composition must account for many factors, such as image harmonization and image matting. In recent years, studies combining image composition with relighting have emerged, aiming to make composites look more convincing. However, relighting typically demands large and expensive equipment, so the combination of image composition and relighting still leaves room for further investigation.
In this study, portraits were photographed and matted against a pre-built green screen, keeping costs low by using simple capture hardware, and normal maps were generated with the photometric stereo method. The normal map, the matted image with transparency, and a 360-degree environment map were then fed into a renderer to composite and relight the portrait within the 360-degree environment map.
To verify the quality of the composites, four experiments were conducted. Experiment 1: reinforcement of the normal map; Experiment 2: satisfaction with composites in 360-degree environment maps; Experiment 3: relighting with 360-degree environment maps; Experiment 4: the effect of lens focal length on composition. Experiments 2 and 3 used human-factors experiments for subjective evaluation, scored on a five-point Likert scale; visual-perception questionnaire ratings were collected from 30 participants to assess the quality of the composites produced in this study.
Experiment 1 used inpainting to optimize discontinuities in the normal maps; although a fully automatic method was not achieved, some normal maps were still successfully improved. Experiment 2 composited 8 subjects with 8 kinds of 360-degree environment maps, also accounting for differences in body pose, yielding 128 composite images that were analyzed through subjective evaluation. Experiment 3 composited and rendered 2 decolorized subjects with 2 kinds of 360-degree environment maps, and the decolorized relighting results were analyzed through subjective evaluation. Experiment 4 composited 2 subjects with 2 kinds of 360-degree environment maps under 2 virtual-camera focal lengths, and the rendered composites were inspected to examine whether the virtual camera affects the composite image.
Statistical analysis showed that overall satisfaction with the composites in Experiment 2 was above the midpoint (> 3 on the five-point Likert scale). Under this research framework, better normal-map reconstruction quality led to higher overall satisfaction with the composite; portraits in neutral poses yielded higher satisfaction when composited with 360-degree environment maps; environment maps with more diffuse lighting produced better overall visual results; dark colors performed worse under relighting; decolorized relighting removed the material-reflectance problem and therefore performed better; and the virtual lens focal length had no negative impact on the composites.
To achieve more realistic results, modern image composition must consider many factors, such as image harmonization and image matting. In recent years, several studies have combined image composition with relighting, intending to make composites more realistic. However, image relighting is computationally expensive, so there is still room for improvement, particularly in combining image composition with relighting.
In this study, a portrait photo booth was developed, and a green screen was used to remove the backgrounds of portrait images. Based on this simple and inexpensive hardware, we applied a simplified photometric stereo method to generate normal maps. After photography, we exported the color portrait image with a transparent background, together with its corresponding normal map, to the Blender software. In addition, a 360-degree image was preset as the environment texture in the Blender rendering scene. As a result, we could perform image composition under the various lighting conditions captured in the 360-degree image.
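The normal maps above come from photometric stereo. As a rough illustration only, here is a minimal sketch of the classic Lambertian least-squares formulation over known directional lights; the simplified variant used in this thesis may differ, and the light directions and synthetic flat-surface data below are assumptions for illustration:

```python
import numpy as np

def photometric_stereo(images, lights):
    """Lambertian photometric stereo: recover per-pixel unit normals
    and albedo from K images taken under known directional lights.
    images: (K, H, W) grayscale stack; lights: (K, 3) unit directions.
    Solves I = L @ g per pixel in the least-squares sense, g = albedo * n.
    """
    K, H, W = images.shape
    I = images.reshape(K, -1)                        # (K, H*W)
    g, *_ = np.linalg.lstsq(lights, I, rcond=None)   # (3, H*W)
    albedo = np.linalg.norm(g, axis=0)               # albedo = |g|
    normals = g / np.maximum(albedo, 1e-8)           # unit-length normals
    return normals.reshape(3, H, W), albedo.reshape(H, W)

# Synthetic check: a flat white surface facing the camera, n = (0, 0, 1)
lights = np.array([[ 0.5,  0.0, 0.866],
                   [-0.5,  0.0, 0.866],
                   [ 0.0,  0.5, 0.866],
                   [ 0.0, -0.5, 0.866]])
shading = lights @ np.array([0.0, 0.0, 1.0])         # Lambertian intensities
images = np.tile(shading[:, None, None], (1, 4, 4))  # four tiny 4x4 images
normals, albedo = photometric_stereo(images, lights)
```

At least three non-coplanar light directions are needed for the per-pixel system to be solvable; using four, as here, overdetermines it and lets least squares absorb some noise.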
To verify the quality of the image composition, four experiments were conducted. Experiment 1: improvement of the normal map; Experiment 2: subjective satisfaction with portrait composites in 360-degree images; Experiment 3: relighting with 360-degree images; Experiment 4: the effect of the virtual field of view (FOV) on image composition. Experiments 2 and 3 were based on a subjective questionnaire scored on a five-point Likert scale; 30 participants took part in the visual-perception questionnaire. Finally, we conducted an overall analysis of the composite images.
In Experiment 1, discontinuities in the normal map were optimized by inpainting. Although the method was not fully automatic, some normal maps were still successfully optimized. Experiment 2 combined eight portrait photos in two different poses with eight kinds of 360-degree images, yielding 128 sets of experimental images, which were then subjectively evaluated and analyzed. In Experiment 3, two color-free portraits and two kinds of 360-degree images were composited, and the relighting results after decolorization were subjectively evaluated. In Experiment 4, two portrait photos and two kinds of 360-degree images were composited under two virtual-camera focal lengths, and the rendered composites were inspected to determine whether the virtual camera influenced the composite image.
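Experiment 1 repairs normal-map discontinuities by inpainting. The exact algorithm is not spelled out in this summary, so the following is a hypothetical numpy-only sketch of the general idea: pixels flagged as discontinuous are filled by iterative neighbour averaging (a simple diffusion fill, not the thesis's actual method), then renormalised to unit length:

```python
import numpy as np

def fill_normal_gaps(normal_map, mask, iters=200):
    """Diffusion-style inpainting of masked pixels in a normal map:
    masked pixels are repeatedly replaced by the mean of their four
    neighbours, then the whole map is renormalised to unit length.
    normal_map: (H, W, 3) float; mask: (H, W) bool, True = fill.
    """
    n = normal_map.copy()
    for _ in range(iters):
        avg = (np.roll(n, 1, 0) + np.roll(n, -1, 0) +
               np.roll(n, 1, 1) + np.roll(n, -1, 1)) / 4.0
        n[mask] = avg[mask]                       # update only the hole
    norm = np.linalg.norm(n, axis=2, keepdims=True)
    return n / np.maximum(norm, 1e-8)             # re-unitise normals

# A flat-facing normal map with a corrupted hole in the middle
nm = np.zeros((9, 9, 3)); nm[..., 2] = 1.0        # all normals = (0, 0, 1)
hole = np.zeros((9, 9), bool); hole[3:6, 3:6] = True
nm[hole] = 0.0                                    # simulate a discontinuity
filled = fill_normal_gaps(nm, hole)
```

The hole pixels converge toward the surrounding normals, which matches the intuition of the experiment: the fill smooths over discontinuities rather than reconstructing fine geometry, which is why some maps still needed manual attention.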
After statistical analysis, overall satisfaction with the image composites in Experiment 2 was above the midpoint (> 3 points on the 5-point Likert scale). Under this research framework, when the normal map performed well, overall satisfaction with the composite was also above acceptable; portraits in neutral poses were rated higher than those in non-neutral poses; composites lit by uniformly distributed light performed better than those without; subjects wearing dark-colored apparel were relit less effectively; color-free relit images avoid the material-reflectance problem, so their composites looked better; and the virtual lens focal length had no negative impact on image composition.
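The "above the midpoint" claim is the kind of result a one-sample t test against the scale midpoint of 3 would support. A minimal sketch, with hypothetical ratings standing in for the 30 participants' actual scores:

```python
import numpy as np

def likert_above_midpoint(scores, midpoint=3.0):
    """One-sample t statistic testing whether mean 5-point Likert
    ratings exceed the scale midpoint. Returns (mean, t). With n = 30
    and alpha = .05 one-tailed, t above about 1.70 (29 dof) indicates
    satisfaction significantly above the midpoint.
    """
    scores = np.asarray(scores, float)
    n = scores.size
    mean = scores.mean()
    se = scores.std(ddof=1) / np.sqrt(n)          # standard error of mean
    return mean, (mean - midpoint) / se

# Hypothetical ratings from 30 participants (not the thesis data)
ratings = [4, 3, 5, 4, 3, 4, 4, 5, 3, 4, 4, 3, 5, 4, 4,
           3, 4, 5, 4, 3, 4, 4, 3, 5, 4, 4, 3, 4, 4, 5]
mean, t = likert_above_midpoint(ratings)
```

Comparing the sample mean against 3 rather than simply reporting it guards against calling a mean of, say, 3.1 "above average" when its confidence interval straddles the midpoint.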