
Author: 余知寰 (Chih-Huan Yu)
Thesis Title: 學習在函數空間中增強單視圖重建3D模型細節之研究 (Single-View 3D Model Reconstruction with Enhanced Details Using Deep Convolutional Neural Network in Function Space)
Advisor: 吳怡樂 (Yi-Leh Wu)
Committee: 陳建中 (Jiann-Jone Chen), 唐政元 (Cheng-Yuan Tang), 閻立剛 (Li-Kang Yen)
Degree: Master (碩士)
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science (電資學院 - 資訊工程系)
Thesis Publication Year: 2020
Graduation Academic Year: 108
Language: English
Pages: 40
Keywords (in Chinese): 占用網路, 單視圖, 3D重建 (Occupancy Networks, single view, 3D reconstruction)
Keywords (in other languages): Occupancy Networks, 3D Model Reconstruction, CNN
Reference times: Clicks: 185, Downloads: 11

Abstract (translated from Chinese): Recently, learning-based methods for 3D reconstruction have become increasingly popular. Unlike traditional multi-view reconstruction algorithms, reconstructing a 3D model from only a single view is more challenging. Popular representations can be roughly divided into three types: point-cloud-based, voxel-based, and mesh-based. In this thesis, we improve upon a novel deep convolutional neural network architecture, Occupancy Networks [1]: we replace the network's encoder with EfficientNet and change the decoder to a DenseNet-based architecture. We experiment with three categories of the ShapeNet [6] dataset: airplanes, cars, and chairs, splitting the data into training, validation, and test sets. Experiments show that our proposed network outperforms the original Occupancy Networks both visually and in quantitative comparisons.


Recently, learning-based methods for 3D reconstruction have become increasingly popular. Unlike traditional multi-view reconstruction algorithms, reconstructing a 3D model from a single view is more challenging. Popular representation methods can be roughly divided into three types: point-cloud-based, voxel-based, and mesh-based representations. In this paper, we improve upon a novel deep convolutional neural network architecture, Occupancy Networks [1]. We replace the encoder of the network with EfficientNet [3], and change the decoder to a DenseNet-based [4] architecture. We conduct experiments on three categories of the ShapeNet [6] dataset: airplanes, cars, and chairs. The dataset is split into training, validation, and test sets. Experiments show that our proposed network outperforms the original Occupancy Networks in both visual and quantitative comparisons.
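The core idea behind Occupancy Networks [1] is to represent a shape implicitly as a continuous function f(p, x) → [0, 1] giving the probability that a 3D point p lies inside the object (conditioned on image features x), and to extract a surface by thresholding at some τ. The querying pipeline can be sketched with a toy analytic occupancy function standing in for the learned network; the sphere shape, the threshold value, and all names here are illustrative assumptions, not details taken from the thesis:

```python
import math

TAU = 0.5  # occupancy threshold separating "inside" from "outside"

def occupancy(p):
    """Toy stand-in for the learned network f(p, x): a soft unit sphere.

    A real Occupancy Network conditions on image features x produced by
    the encoder; here the shape is fixed so the sketch stays runnable.
    """
    x, y, z = p
    dist = math.sqrt(x * x + y * y + z * z)
    # Sigmoid of the signed distance to the sphere surface:
    # values above 0.5 are inside, values below 0.5 are outside.
    return 1.0 / (1.0 + math.exp(8.0 * (dist - 1.0)))

def query_grid(f, resolution=8, extent=1.5):
    """Evaluate the occupancy function at every point of a regular 3D grid."""
    ticks = [-extent + 2.0 * extent * i / (resolution - 1)
             for i in range(resolution)]
    return {(x, y, z): f((x, y, z))
            for x in ticks for y in ticks for z in ticks}

grid = query_grid(occupancy)
inside = [p for p, o in grid.items() if o >= TAU]

print(occupancy((0.0, 0.0, 0.0)) > TAU)  # the centre is inside
print(occupancy((2.0, 0.0, 0.0)) < TAU)  # a far point is outside
print(0 < len(inside) < len(grid))       # the threshold partitions the grid
```

In the actual method, the thresholded grid is turned into a mesh with an isosurface extraction step such as Marching Cubes [9]; because the occupancy function is continuous, it can be queried at arbitrary resolution rather than at a fixed voxel grid.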

Contents:
Abstract (Chinese)
Abstract
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Related Work and Review
  2.1 Image feature extraction
  2.2 Different CNN network models
  2.3 Introduction to Occupancy networks
Chapter 3. Proposed Method
  3.1 Structure of the encoder
  3.2 Structure of the decoder
  3.3 Generation
Chapter 4. Experiments
  4.1 Datasets and Metrics
  4.2 Comparison of ONet and Encoder replaced by EfficientNet
  4.3 Comparison of ONet and Decoder replaced by DenseNet
  4.4 Comparison of ONet with our proposed network
  4.5 Reconstruction failures
Chapter 5. Conclusions and Future Work
References

[1] Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., & Geiger, A. "Occupancy networks: Learning 3D reconstruction in function space." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4460-4470), 2019.
[2] He, K., Zhang, X., Ren, S., & Sun, J. "Deep residual learning for image recognition." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778), 2016.
[3] Tan, M., & Le, Q. V. "EfficientNet: Rethinking model scaling for convolutional neural networks." arXiv preprint arXiv:1905.11946, 2019.
[4] Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. "Densely connected convolutional networks." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4700-4708), 2017.
[5] De Vries, H., Strub, F., Mary, J., Larochelle, H., Pietquin, O., & Courville, A. C. "Modulating early visual processing by language." In Advances in Neural Information Processing Systems (pp. 6594-6604), 2017.
[6] Chang, A. X., Funkhouser, T. A., Guibas, L. J., Hanrahan, P., Huang, Q., Li, Z., Savarese, S., Savva, M., Song, S., Su, H., Xiao, J., Yi, L., & Yu, F. "ShapeNet: An information-rich 3D model repository." arXiv preprint arXiv:1512.03012, 2015.
[7] Choy, C. B., Xu, D., Gwak, J., Chen, K., & Savarese, S. "3D-R2N2: A unified approach for single and multi-view 3D object reconstruction." In European Conference on Computer Vision (pp. 628-644). Springer, Cham, 2016.
[8] Gerstner, T., & Pajarola, R. "Topology preserving and controlled topology simplifying multiresolution isosurface extraction." (pp. 259-266). IEEE, 2000.
[9] Lorensen, W. E., & Cline, H. E. "Marching cubes: A high resolution 3D surface construction algorithm." ACM SIGGRAPH Computer Graphics, 21(4), 163-169, 1987.
[10] Krizhevsky, A., Sutskever, I., & Hinton, G. E. "ImageNet classification with deep convolutional neural networks." In Advances in Neural Information Processing Systems (pp. 1097-1105), 2012.
[11] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. "MobileNetV2: Inverted residuals and linear bottlenecks." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4510-4520), 2018.
[12] Wen, C., Zhang, Y., Li, Z., & Fu, Y. "Pixel2Mesh++: Multi-view 3D mesh generation via deformation." In Proceedings of the IEEE International Conference on Computer Vision (pp. 1042-1051), 2019.
[13] Yang, G., Cui, Y., Belongie, S., & Hariharan, B. "Learning single-view 3D reconstruction with limited pose supervision." In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 86-101), 2018.
[14] Achlioptas, P., Diamanti, O., Mitliagkas, I., & Guibas, L. "Learning representations and generative models for 3D point clouds." In International Conference on Machine Learning (pp. 40-49), 2018.
[15] Kanazawa, A., Tulsiani, S., Efros, A. A., & Malik, J. "Learning category-specific mesh reconstruction from image collections." In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 371-386), 2018.
[16] Kato, H., & Harada, T. "Learning view priors for single-view 3D reconstruction." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 9778-9787), 2019.
[17] Liao, Y., Donne, S., & Geiger, A. "Deep marching cubes: Learning explicit surface representations." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2916-2925), 2018.
[18] Rezende, D. J., Eslami, S. A., Mohamed, S., Battaglia, P., Jaderberg, M., & Heess, N. "Unsupervised learning of 3D structure from images." In Advances in Neural Information Processing Systems (pp. 4996-5004), 2016.
[19] Fan, H., Su, H., & Guibas, L. J. "A point set generation network for 3D object reconstruction from a single image." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 605-613), 2017.
[20] Ranjan, A., Bolkart, T., Sanyal, S., & Black, M. J. "Generating 3D faces using convolutional mesh autoencoders." In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 704-720), 2018.
[21] Niu, C., Li, J., & Xu, K. "Im2Struct: Recovering 3D shape structure from a single RGB image." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4521-4529), 2018.
